I/O is about channels. A channel is a fixed-bandwidth queue that generally has both a 'signaling' (transaction) rate and a raw bandwidth; the one most people are familiar with is a SATA port.
A SATA port can do some number of I/Os per second (this depends mostly on how the controller is designed), and the SATA data pipe these days is 6 gigabits/second, or 600 megabytes per second.
A SATA disk (the spinning-rust kind) is different: it can pump out some maximum amount of data per second (usually about 100 MB/sec) and has an average seek time of a few milliseconds. So if you are doing random I/Os to the disk, so that every I/O needs a seek, and the seek takes 10 ms, you can do at most 100 I/O operations per second (IOPS). If you transfer 512 bytes with each IOP, you get a throughput of 100 × 0.5 KB, or 50 KB per second; if you transfer 100 KB per transaction, you get 10 megabytes per second. Anyway, the bottom line is that an I/O channel is bound both by its bandwidth and by the number of IOPS it can do.
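That back-of-the-envelope arithmetic is easy to sketch. Here's a rough model (assuming the 10 ms seek and ~100 MB/s media rate from above; the function name and numbers are just illustrative):

```python
def random_io_throughput(seek_ms, transfer_bytes, media_rate=100e6):
    """Approximate sustained throughput for purely random I/O:
    every operation pays a full seek plus the time to stream the data."""
    seek_s = seek_ms / 1000.0
    transfer_s = transfer_bytes / media_rate
    iops = 1.0 / (seek_s + transfer_s)   # operations per second
    return iops * transfer_bytes         # bytes per second

# 10 ms seek, 512-byte transfers: ~100 IOPS, ~51 KB/s
print(random_io_throughput(10, 512))
# 10 ms seek, 100 KB transfers: ~9 MB/s (a bit under 10, since the
# transfer itself now takes a meaningful fraction of each I/O)
print(random_io_throughput(10, 100_000))
```

Note how the larger transfer size wins by amortizing the fixed seek cost over more bytes, which is exactly why the channel is bound by IOPS and bandwidth at the same time.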
A PCIe 'lane' (PCIe 1.x) can do 250 MB per second, and with Intel's 5500 chipset about 2 million IOPS (although those IOPS are spread across all PCIe channels). A "big" server chipset will give you 32 or maybe 48 PCIe lanes, which can be carved up as 12 x4 links, or 2x16 + 8x1, or 2x16 + 1x4 + 4x1 (a common desktop config for two video cards and miscellaneous peripherals).
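The lane arithmetic works out like this (a sketch, assuming the 250 MB/s per-lane figure above; real boards also lose some lanes to the chipset itself):

```python
PCIE1_LANE_MBPS = 250  # ~250 MB/s per lane, per direction, for PCIe 1.x

def lane_config(slots):
    """slots is a list of link widths, e.g. [16, 16, 4, 1, 1, 1, 1].
    Returns (total lanes consumed, aggregate MB/s)."""
    lanes = sum(slots)
    return lanes, lanes * PCIE1_LANE_MBPS

# The desktop config from above: two x16, one x4, four x1
print(lane_config([16, 16, 4, 1, 1, 1, 1]))  # (40, 10000) -> 40 lanes, ~10 GB/s
```

The point is that the lanes are a fixed budget: a fatter link for one device is lanes taken away from everything else.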
The bottleneck is exactly like networking: each I/O device can consume a certain amount of channel bandwidth, but since they don't run flat out 24/7 you can "oversubscribe", which is to say potentially attach more I/O devices than you have bandwidth to service them all at once (this happens a lot in big disk arrays). Ultimately the transaction rate of the entire system is constrained by the cross-sectional bandwidth of its memory and I/O channels, so you end up I/O-limited in terms of what you can push around.
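Oversubscription is just a ratio of worst-case demand to available bandwidth. A minimal sketch (the device counts and rates here are made-up examples, not from the text):

```python
def oversubscription(device_peak_mbps, n_devices, channel_mbps):
    """Ratio of aggregate worst-case device demand to channel capacity.
    > 1.0 means the channel can't service every device at full tilt."""
    return (device_peak_mbps * n_devices) / channel_mbps

# e.g. 24 disks at 100 MB/s each hanging off a 600 MB/s link:
print(oversubscription(100, 24, 600))  # 4.0 -> 4:1 oversubscribed
```

A 4:1 ratio is fine as long as, on average, no more than a quarter of the disks are streaming at once; the moment they all get busy, the channel is the limit, not the disks.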
Anyway, a mainframe has thousands of channels. Lots and lots of them. And each of those channels has a lot of raw bandwidth and can sustain a high transaction rate. This makes a mainframe a transaction monster.
In the taxonomy of computer systems, 'supercomputers' have a monster amount of memory bandwidth and high transaction rates (low latency) between their compute nodes (or cores). Mainframes have a monster amount of I/O bandwidth with high transaction rates between their nodes and I/O devices. General-purpose computers achieve good economics by limiting memory and I/O bandwidth to what will fit in a single piece of silicon (or at most a couple of pieces).