Bibliography

[1] T. Agerwala and J. Cocke, High performance reduced instruction set processor, "IBM Tech Rep", 1987.

[2] G.M. Amdahl, G.A. Blaauw, and F.P. Brooks Jr., Architecture of the IBM System/360, "IBM J. Res. Dev.", 8/2, 1964, pp. 87-101.

[3] T.E. Anderson, D.E. Culler, and D. Patterson, A case for NOW (networks of workstations), "IEEE Micro", 15/1, 1995, pp. 54-64.

[4] J. Archibald and J.L. Baer, Cache coherence protocols: Evaluation using a multiprocessor simulation model, "ACM Trans. Comput. Syst.", 4/4, 1986, pp. 273-298.

[5] J.V. Atanasoff, Computing machine for the solution of large systems of linear equations, Internal Report, Iowa State University, Ames, 1940.

[6] D. Bhandarkar and D.W. Clark, Performance from architecture: Comparing a RISC and a CISC with similar hardware organizations, in "Proceedings of the Fourth Conference on Architectural Support for Programming Languages and Operating Systems (Palo Alto, April)", IEEE/ACM, 1991, pp. 310-319.

[7] G. Bell, R. Cady, H. McFarland, B. DeLagi, J. O’Laughlin, R. Noonan, and W. Wulf, A new architecture for mini-computers: The DEC PDP-11, in "Proceedings of AFIPS SJCC", 1970, pp. 657-675.

[8] W.J. Bouknight, S.A. Denenberg, D.E. McIntyre, J.M. Randall, A.H. Sameh, and D.L. Slotnick, The Illiac IV system, "Proc. IEEE", 60/4, 1972, pp. 369-379.

[9] I.V. Bucher and A.H. Hayes, I/O performance measurement on Cray-1 and CDC 7000 computers, in "Proceedings of the Computer Performance Evaluation Users Group", 16th Meeting, NBS 500-65, 1980, pp. 245-254.

[10] W. Buchholz, Planning a Computer System: Project Stretch, McGraw-Hill, New York, 1962.

[11] A.W. Burks, H.H. Goldstine, and J. von Neumann, Preliminary discussion of the logical design of an electronic computing instrument, in "Papers of John von Neumann", W. Aspray and A. Burks (eds.), MIT Press, Cambridge, and Tomash Publishers, Los Angeles, 1987, pp. 97-146.

[12] P.M. Chen, E.K. Lee, G.A. Gibson, R.H. Katz, and D.A. Patterson, RAID: High-performance, reliable secondary storage, "ACM Comput. Surv.", 26/2, 1994, pp. 145-188.

[13] E.S. Davidson, A.T. Thomas, L.E. Shar, and J.H. Patel, Effective control for pipelined processors, in "COMPCON", IEEE, San Francisco, 1974, pp. 181-184.

[14] D.R. Ditzel and D.A. Patterson, Retrospective on high-level language computer architecture, in "Proceedings of the Seventh Annual Symposium on Computer Architecture (La Baule, France, June)", 1980, pp. 97-104.

[15] J.R. Ellis, Bulldog: A Compiler for VLIW Architectures, MIT Press, Cambridge, 1986.

[16] J.A. Fisher, Very long instruction word architectures and ELI-512, in "Proceedings of the Tenth Symposium on Computer Architecture (Stockholm, June)", 1983, pp. 140-150.

[17] M. Golden and T. Mudge, A comparison of two common pipeline structures, in "Institution of Electrical Engineers Proceedings – E, Computers and Digital Techniques", 1996.

[18] H.H. Goldstine, The Computer: From Pascal to von Neumann, Princeton University Press, Princeton, 1972.

[19] E.A. Hauck and B.A. Dent, Burroughs B6500-B7500 stack mechanism, in "Proceedings of AFIPS SJCC", 1968, pp. 245-251.

[20] J.P. Hayes and T.N. Mudge, Hypercube supercomputers, "Proc. IEEE", 77/12, 1989, pp. 1829-1841.

[21] J. Hennessy, VLSI processor architecture, "IEEE Transactions on Computers", 33/11, 1984, pp. 1221-1246.

[22] A.S. Hoagland, Digital Magnetic Recording, John Wiley & Sons, New York, 1963.

[23] J.H. Holland, A universal computer capable of executing an arbitrary number of subprograms simultaneously, in "Proceedings of the Eastern Joint Computer Conference", 16, 1959, pp. 108-113.

[24] A.D. Hospodor and A.S. Hoagland, The changing nature of disk controllers, "Proc. IEEE", 81/4, 1993, pp. 586-594.

[25] W.M. Hwu and Y. Patt, HPSm, a high performance restricted data flow architecture having minimal functionality, in "Proceedings of the Thirteenth Symposium on Computer Architecture (Tokyo, June)", 1986, pp. 297-307.

[26] B. Jacob, P. Chen, S. Silverman, and T. Mudge, An analytical model for designing memory hierarchies, "IEEE Transactions on Computers", 1996.

[27] M. Johnson, Superscalar Microprocessor Design, Prentice-Hall, Englewood Cliffs, 1990.

[28] N.P. Jouppi and D.W. Wall, Available instruction-level parallelism for superscalar and superpipelined processors, in "Proceedings of the Third Conference on Architectural Support for Programming Languages and Operating Systems", IEEE/ACM, Boston, 1989, pp. 272-282.

[29] T. Kilburn, D.B.G. Edwards, M.J. Lanigan, and F.H. Sumner, One-level storage system, "IRE Trans. Electron. Comput.", EC-11, 1962, pp. 223-235.

[30] P.M. Kogge, The Architecture of Pipelined Computers, McGraw-Hill, New York, 1981.

[31] D. Lenoski, J. Laudon, K. Gharachorloo, A. Gupta, and J.L. Hennessy, The Stanford DASH multiprocessor, in "Proceedings of the Seventeenth International Symposium on Computer Architecture (Seattle, June)", 1990, pp. 148-159.

[32] T. Lovett and S. Thakkar, The Symmetry multiprocessor system, in "Proceedings of the 1988 International Conference on Parallel Processing", University Park, 1988, pp. 303-310.

[33] L.F. Menabrea, Sketch of the Analytical Engine Invented by Charles Babbage, Bibliothèque Universelle de Genève, 1842.

[34] O.A. Olukotun, T.N. Mudge, and R.B. Brown, Performance optimization of pipelined primary caches, in "Proceedings of the Nineteenth Annual International Symposium on Computer Architecture", 1992, pp. 181-190.

[35] D. Patterson, Reduced instruction set computers, "Commun. ACM", 28/1, 1985, pp. 8-21.

[36] D.A. Patterson, G.A. Gibson, and R.H. Katz, A case for redundant arrays of inexpensive disks (RAID), in "ACM SIGMOD Conference Proceedings, Chicago, June 1-3, 1988", 1988, pp. 109-116.

[37] S.A. Przybylski, Cache Design: A Performance-Directed Approach, Morgan Kaufmann Publishers, San Mateo, 1990.

[38] J.T. Schwartz, Ultracomputers, "ACM Trans. Program. Lang. Syst.", 2/4, 1980, pp. 484-521.

[39] C.L. Seitz, The cosmic cube, "Commun. ACM", 28/1, 1985, pp. 22-31.

[40] D.L. Slotnick, W.C. Borck, and R.C. McReynolds, The Solomon computer, in "Proceedings of the Fall Joint Computer Conference (Philadelphia, December)", 1962, pp. 97-107.

[41] A.J. Smith, Cache memories, "ACM Comput. Surv.", 14/3, 1982, pp. 473-530.

[42] A.J. Smith, Disk cache-miss ratio analysis and design considerations, "ACM Trans. Comput. Syst.", 3/3, 1985, pp. 161-203.

[43] J.E. Smith, A study of branch prediction strategies, in "Proceedings of the Eighth Symposium on Computer Architecture, Minneapolis", 1981, pp. 135-148.

[44] M.D. Smith, M. Horowitz, and M.S. Lam, Efficient superscalar performance through boosting, in "Proceedings of the Fifth Conference on Architectural Support for Programming Languages and Operating Systems (Boston, October)", IEEE/ACM, 1992, pp. 248-259.

[45] R.J. Swan, S.H. Fuller, and D.P. Siewiorek, Cm* – A modular, multi-microprocessor, in "Proceedings AFIPS National Computer Conference", 46, 1977, pp. 637-644.

[46] J.E. Thornton, Parallel operation in the Control Data 6600, in "Proceedings of the AFIPS Fall Joint Computer Conference", 26, part II, 1964, pp. 33-40.

[47] G.S. Tjaden and M.J. Flynn, Detection and parallel execution of independent instructions, "IEEE Transactions on Computers", 19/10, 1970, pp. 889-895.

[48] R.M. Tomasulo, An efficient algorithm for exploiting multiple arithmetic units, "IBM J. Res. Dev.", 11/1, 1967, pp. 25-33.

[49] W.R. Touma, The Dynamics of the Computer Industry: Modelling the Supply of Workstations and Their Components, Kluwer Academic, Boston, 1993.

[50] M. Upton, T. Huff, T. Mudge, and R. Brown, Resource allocation in a high clock rate microprocessor, in "Sixth International Conference on Architectural Support for Programming Languages and Operating Systems", ASPLOS-VI, 1994, pp. 98-109.

[51] M.W. Wilkes, Memoirs of a Computer Pioneer, MIT Press, Cambridge, 1985.

[52] M.W. Wilkes, Computing Perspectives, Morgan Kaufmann, San Francisco, 1995.

[53] D.A. Wood and M.D. Hill, Cost-effective parallel computing, "IEEE Comput.", 28/2, 1995.