I've tried a number of ML projects like this too, and sometimes they do work – Bodypix is one example (though not precisely the same thing).
That being said, it's usually pretty difficult to get code from papers to build and run successfully because of dependencies (e.g. depending on a specific version of Python, OpenCV 2, and CUDA support).
Could anyone explain why version numbers make ML stuff so brittle? Why does the CUDA version have to match? Why would a newer version of Python (3.7 vs. 3.6) ever break anything?
TensorFlow is a crapshoot as far as I understand: it changes constantly, making newer versions incompatible. But why do other libraries break backwards compatibility without a major version bump?
Because dependency management for anything C/C++/Python related has often been a massive cluster****. Rust's npm-style package management is probably the greatest innovation to hit low-level programming in a long time. Not to mention that anything Nvidia-related is often closed source, so you are programming against a black-box API from a vendor infamous for bad driver quality on non-Windows systems.
You might be relying on a function that only exists in 3.7 and not in 3.6; code written for 3.6 would still work, but new code using 3.7 features won't be backwards compatible. With compiled code, the errors are usually very hard for people more used to scripting to decode. You get things like missing symbols in the linker phase.
ML projects usually pull in a lot of libraries, so you also get transitive dependencies breaking quite often...
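A quick way to see what you're actually running against is to fingerprint the installed versions; the library names below are just illustrative, but mismatches here are the usual culprit when a repo that "worked for the authors" breaks for you:

```python
import importlib

# Print the exact versions installed in the current environment.
# ML repos often only work with one specific combination, and a
# transitive dependency bumping a version underneath you is a
# common failure mode.
for name in ("numpy", "cv2", "tensorflow"):
    try:
        mod = importlib.import_module(name)
        print(name, getattr(mod, "__version__", "unknown"))
    except ImportError:
        print(name, "not installed")
```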
Yeah, managing dependencies is tricky. I've been super excited about Nix and Guix lately for that reason; if you have a single Nix/Guix revision and a list of packages, you have all the information you need to build exactly the same package tree. (With bit-for-bit reproducibility where possible, no less!)
Some language-specific package managers can do similar things, but you really only get reproducibility for the whole system with a general-purpose package manager. Poetry gets you pretty far within the Python ecosystem, but if you need a specific version of Python/specific native libraries/etc... it doesn't get you all the way there.
I tried to see what would happen when I put in some stock photos of people into the Colab notebook linked from the repo's README. Some worked better than others. Some totally didn't resolve well at all. Overall I think it's interesting, but there are definitely a lot of edge cases.
The next thing might be mapping from that static 3D model to some kind of rigged model that can be animated. Then you would not only need to place the bones and joints but also separate the clothing from the body. Extremely hard but DL has been able to pull off some incredible stuff.
Wow. It's such "high resolution" that the distorted triangular polygons of the untextured models are clearly distinguishable and painfully obvious on my phone.
I guess my idea of "high resolution" differs from what the rest of the world describes as such.
It's interesting how the legs are often of unequal length in the reconstruction when the person is walking, due to some incorrect interpretation of the camera perspective.
I would have guessed this wouldn't be a problem given that there are multiple photos of the same view, such that you can resolve the depth ambiguity of a single camera. I.e., use feature detection to identify the feet in more than one view, then use the two images to resolve the depth along the epipolar lines and reconstruct the shoe in 3D space.
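As a sketch of that idea: given known projection matrices for two views, a matched image point can be triangulated with linear (DLT) triangulation. The camera matrices and the 3D point below are toy values I made up, not anything from the paper:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen in two views.

    P1, P2: 3x4 camera projection matrices
    x1, x2: matched 2D image points in each view
    Returns the estimated 3D point.
    """
    # Each view contributes two linear constraints on the homogeneous
    # 3D point X, derived from x cross (P @ X) = 0.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Two toy cameras: identity pose, and a 1-unit baseline along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

# Project a known 3D point into both views, then recover it.
X_true = np.array([0.5, 0.2, 4.0])
p1 = P1 @ np.append(X_true, 1.0)
p2 = P2 @ np.append(X_true, 1.0)
x1, x2 = p1[:2] / p1[2], p2[:2] / p2[2]
print(triangulate(P1, P2, x1, x2))  # recovers roughly [0.5, 0.2, 4.0]
```

The catch, of course, is that this needs calibrated cameras and reliable correspondences, and in-the-wild photos of a walking person give you neither for free.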
I am interested in high-resolution 3D mapping of skin surfaces, e.g. for facial scar revision. What is the best technology available for this, either research or commercial?
Thanks, I think this deserves an HN post of its own. Some of the things that were done to entice Sadeghi to join send shivers down my spine.
Edit: I've been looking at the details on the rear view of some of the "Single-View Reconstruction" examples, and I'm starting to worry that this may actually not be reproducible.
I have followed this case since the beginning. I am surprised that the academic community (which is fully aware of this case) continues reviewing his work without questioning his ethics, as though nothing happened.
TLDR:
Dr. Iman Sadeghi is the man behind hair rendering tech for Disney and Dreamworks. He left his job at Google to join Hao Li's company Pinscreen (which by the way is funded by big names like Softbank).
When Sadeghi saw red flags inside the company, he raised the issue from within, and finally wanted out. As he was leaving one day, Li and his colleagues literally assaulted Sadeghi to make him give up his company laptop. This, by the way, was recorded on CCTV cameras and can be viewed online.
The fraud case is about falsified results in their SIGGRAPH 2017 Technical Papers submission. They claimed to generate avatar hair shapes in their paper, and when a reviewer asked them for results on many faces, the company paid artists as much as $100 each to generate them manually. Of course, they later claimed to have made it fully automatic AFTER publishing the paper, but that doesn't change the fact that they published false results at one of the biggest computer graphics conferences. Hell, even at that public demo, they showed pre-cached avatars and claimed they were generated in real time.
Terrible grammar, but probably means something like the following:
"These guys were desperate to find hot white girlfriends, but nobody would date them so they resorted to writing this lewd software. I'd advise them to work on something useful instead."
I've found that the general rule is: if you see a Japanese-language comment on an English-language website, it's most likely 1) not written by a Japanese person and 2) offensive to varying degrees.
My guess is that it’s an inside joke of some kind, targeted at English speakers with some knowledge of Japanese.
When I was a kid, my classmates would often ask me to teach them dirty English slang, which they'd then say out loud in front of teachers. They got a kick out of being able to say naughty things without the adults noticing.
So yeah I think it’s grade school humor, fairly innocuous stuff but kind of weird to see on HN.
So you say it wasn’t a joke, you wrote the original comment in Japanese because it was your sincere opinion and you wanted the researchers to understand.
Not sure I follow.
If that were the case you should have written in English, obviously. The paper is written in English, some of the authors have Japanese names but we don’t know if they’re native Japanese speakers, the last author (team leader) has a Chinese name, and everyone who does CS research at this level will understand English just fine.
I'd guess other non-Japanese-speakers? Throwing the comment at Google Translate/Papago is easy enough, and might register as "there's a Japanese troll/spam problem". Maybe generating this impression is the objective? It'd be a form of nerd-sniping - if you feel satisfied with yourself over going the extra step of using a tool, you might stop thinking critically at that point and be susceptible to being duped.
I'm fairly confident that calling the grammar "terrible" is an exaggeration.
Although the main gist of it is right, the translation is embellished, and I suspect deliberately so.
The tone is deliberately harsh. "Lewd" is usually (and certainly in this context) not a good translation of エッチ; "lewd" is a negative word that better corresponds to concepts like いやらしい or maybe 淫らな. Nothing like "nobody would date them" is written or implied, just that they can't find that type of woman (そういう女性) to date them, for whatever reason. It doesn't imply they don't have girlfriends, or even that a woman such as the one they modeled in the software wouldn't want them; no speculation is offered about why that woman isn't there. 付き合ってくれる人がいない does not mean "nobody would date someone." I also didn't use any phrase that corresponds to "desperate," like 窮余 or 絶望 or what have you.
Suppose you tried this with your own camera, or maybe in different lighting: it would probably break entirely.
The paper definitely seems like an improvement, but I wouldn't get excited about using this in experimentation or production yet.