Huh. I thought it was obvious: e.g., on a 640x480 screen the top-left corner is (0, 0), the bottom-right corner is (640, 480), and pixels are 1x1 squares. We take the coordinate of a pixel's top-left corner to be its coordinate, so the top-left pixel is at (0, 0) and the bottom-right one is at (639, 479), and obviously their centres are at (0.5, 0.5) and (639.5, 479.5).
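To make that convention concrete, here's a minimal sketch in Python. The function names are made up for illustration, and the 640x480 size is just the example screen from above, not anything standard.

```python
import math

# Example screen size; an assumption taken from the 640x480 example.
WIDTH, HEIGHT = 640, 480


def pixel_centre(col, row):
    """Centre of the 1x1 pixel whose top-left corner sits at (col, row)."""
    return (col + 0.5, row + 0.5)


def pixel_containing(x, y):
    """Integer coordinate of the pixel whose square covers the point (x, y)."""
    return (math.floor(x), math.floor(y))


top_left = pixel_centre(0, 0)
bottom_right = pixel_centre(WIDTH - 1, HEIGHT - 1)
print(top_left, bottom_right)
```

Going the other way, `pixel_containing(639.5, 479.5)` lands back on pixel (639, 479), so under this convention the integer coordinate is just the floor of any point inside the pixel's square.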
But apparently some people treated pixels as geometrical points without size?
It's normal in signal processing to treat samples (whether audio or image) as points. A lot of things stop making sense if you replace them with line segments and rectangles.