Show HN: I am a high school student and I did research on 3D adversarial attacks (arxiv.org)
116 points by c0deb0t on Oct 3, 2019 | 34 comments


If you have any questions, I (the first author of the paper) will be more than happy to answer them. This is my second work in a series on adversarial attacks/defenses in 3D space (first paper [1]).

A high-level overview of the research:

Basically, neural networks are weak against adversarial attacks that change the input by a little bit to cause the prediction to be wrong. We look at these adversarial attacks in 3D space, specifically on 3D point clouds (think LiDAR and RGB-D data). In the paper, four attacks in two different categories (distributional and shape attacks) are proposed. The main benefit of distributional attacks is their imperceptibility. On the other hand, shape attacks are more easily crafted in real life (though more perceptible) and also robust against the point removal defenses that were proposed in previous work. If you want a more comprehensive (but less dense than the paper) overview, take a look at my blog post [2].
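For intuition, here is a minimal sketch of the kind of bounded perturbation an adversarial attack applies. A tiny linear model stands in for a real point cloud classifier (e.g. PointNet); the model, names, and eps value are all illustrative, not the paper's actual attacks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a point cloud classifier: scores a cloud by a linear
# readout of its mean coordinates. Real 3D models are far more complex;
# this only illustrates the attack loop.
W = rng.normal(size=(3, 2))  # 3D coordinates -> 2 class logits

def logits(points):
    return points.mean(axis=0) @ W

def sign_gradient_attack(points, label, eps):
    """Shift every point by at most eps per coordinate in the direction
    that shrinks the true class's logit margin (an FGSM-style step)."""
    other = 1 - label
    # For this linear model, d(margin)/d(point_i) is constant:
    grad = (W[:, label] - W[:, other]) / len(points)
    return points - eps * np.sign(grad)

cloud = rng.normal(size=(128, 3))
label = int(np.argmax(logits(cloud)))
adv = sign_gradient_attack(cloud, label, eps=0.05)
# Each coordinate moved by at most 0.05, yet the true-class margin shrinks.
```

The point of the sketch is the shape of the trade-off: the perturbation is bounded (imperceptibility) while the classifier's confidence in the true class provably drops.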

[1] https://arxiv.org/abs/1901.03006

[2] https://blog.liudaniel.com/birth-of-a-new-sub-sub-field


The field is in a really sad state. Children poking holes in the state of the art. ;)

Joking aside, fantastic work. The differentiable reformulation in the distributional attack is a tour de force.

A question. Why did you do it?


I don't really think it's fair to say that the "field is in a sad state". Plenty of insightful and well-written papers are put out every day by hardworking and intelligent people. I still have a long way to go.

I do research because I like solving hard problems that people have never considered. I like to ensure that what I have learned will be put into practical use.


For example, solving 1v1 StarCraft wouldn't be a problem that people have never considered, and it can't be put to practical use. So you wouldn't do it? I'm genuinely curious: what makes the difference and motivates you to engage in research rather than playing challenging games?


Well, I used to play a lot of competitive FPS games because I found it fun. I have also done competitive programming problems for fun/accolades. But after doing more practical research, I realized it felt better to do impactful stuff (especially getting recognized). Also, research is nice because I perform terribly at short events (games, contests) under pressure. I think that if I had tried something else that met the same criteria before research, I probably wouldn't have done research.


Stupid q: how is this different from all of the other neural network adversarial attack papers that have come out recently?

Why would 3d not be a subcase of that work?


Many previous algorithms (adversarial training, distillation, most attacks, etc.) can be used in 3D in a fairly straightforward manner as they are architecture-agnostic. However, they do not make use of specific properties that are present in 3D point sets and the 3D neural networks. For example, removing points as an attack or a defense is specific to point sets; you cannot really remove pixels in an image. The distribution of points in a point cloud also gives us information that can be used in defenses, but the attacker can also tamper with it (this is partially the focus of this work).
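As a concrete sketch of the point-removal idea: drop points whose neighborhoods are statistically sparse. This is a generic outlier filter, not the specific defenses evaluated in the paper, and the k/alpha thresholds are illustrative:

```python
import numpy as np

def remove_outlier_points(points, k=5, alpha=1.5):
    """Drop points whose mean k-NN distance is more than alpha standard
    deviations above the average -- a simple point-removal defense.
    There is no pixel-grid analogue: it leans on the point distribution."""
    # Pairwise distances (fine for small clouds; use a KD-tree at scale).
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    knn = np.sort(d, axis=1)[:, 1:k + 1]  # skip the zero self-distance
    score = knn.mean(axis=1)
    keep = score <= score.mean() + alpha * score.std()
    return points[keep]

rng = np.random.default_rng(0)
cloud = rng.normal(size=(200, 3))
spiked = np.vstack([cloud, [[10.0, 10.0, 10.0]]])  # one far-off adversarial point
cleaned = remove_outlier_points(spiked)
```

Because object surfaces carry many redundant points, dropping a few suspicious ones rarely destroys the shape information the classifier needs.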

Similarly, adversarial attacks/defenses are still being proposed for graphs, audio, and other domains because we can leverage domain-specific knowledge.


> The distribution of points in a point cloud

Would you have a canonical name for this distribution? If you try matching log likelihoods, what parametric family does it resemble? Briefly, given one of the canonical two dozen (uni/multi)variate distributions, one can create new distributions by location-scale transforms, mixtures, or, say, by using a k-param EFD family. So if I pick a k-param MVN (multivariate normal with k means, k sigmas & O(k^2) correlations), I can create new distributions all day long by tweaking these 2k+k^2 params until the cows come home. Brittle inference engines such as CNNs trained on a specific family with specific (hyper)parameters will fail once the distribution changes significantly, though visually the changes will be imperceptible.
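A quick sketch of that parameter-tweaking idea (independent normals rather than a full MVN to keep it short; the 0.1 and 1.2 nudges are arbitrary): the tweaked sample looks similar pointwise but separates cleanly under a log-likelihood comparison.

```python
import numpy as np

rng = np.random.default_rng(0)

# Base family: 3 independent normals (a diagonal-covariance MVN).
mu, sigma = np.zeros(3), np.ones(3)
base = rng.normal(mu, sigma, size=(1000, 3))

# A small location-scale tweak yields a "new" distribution whose samples
# are visually close to the base ones.
mu2, sigma2 = mu + 0.1, sigma * 1.2
tweaked = rng.normal(mu2, sigma2, size=(1000, 3))

def loglik(x, mu, sigma):
    """Mean log likelihood under an independent-normal model."""
    z = (x - mu) / sigma
    return (-0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)).sum(axis=1).mean()

# The tweaked sample fits its own parameters noticeably better, even
# though the per-coordinate change is tiny.
```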


Sorry, I'm not too knowledgeable about the math side of the distribution. Usually, we would want some set of points from the surface of the object that maximizes the distance between each point and its nearest neighbors. Then, the points would be distributed with uniform density across the surface.
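One standard way to get that kind of spread (a common sampling heuristic, not necessarily what any particular dataset used) is farthest point sampling:

```python
import numpy as np

def farthest_point_sampling(points, m):
    """Greedily pick m points, each time adding the point farthest from
    those already chosen; the result approaches uniform density with
    large nearest-neighbor spacing, as described above."""
    chosen = [0]  # arbitrary seed point
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(m - 1):
        nxt = int(np.argmax(dist))  # farthest from the current selection
        chosen.append(nxt)
        # Each point tracks its distance to the nearest chosen point.
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return points[chosen]

rng = np.random.default_rng(0)
dense = rng.normal(size=(500, 3))
sampled = farthest_point_sampling(dense, 32)
```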

In my previous paper, I've shown that moving the points around on the surface of an object does lead to imperceptible but effective adversarial attacks, as you've observed.


No worries. On your GitHub you have all the point clouds, so I'll give it a shot one of these days. If you mathematize the distribution, you get a lot more mileage out of your results because you get interpretation for free. Changing moments (skew, etc.) will sufficiently alter the distribution while being visually imperceptible.


>you cannot really remove pixels in an image

I'm unconvinced by this statement. There are many attempts to negate attacks by applying linear transformations, masks, etc., to images. Removing pixels is not novel.

We like to imply that domain knowledge is relevant but after you design a feature vector it all ends up the same.


The specific feature vector statement doesn’t hold for audio (at least).

The time dimension adds complexity to the problem as the optimal values for the perturbation vary depending on both the immediately surrounding values, and many of the values beforehand.

When I say “hello world”, the fact I said “e” depends on the fact I said “h”. “L” depends on both “e” and “h”... etc etc.

Adds an extra dimension to the problem.

Also, distance metrics for images aren’t ideal for audio, for many reasons. That’s why audio signal processing is a different sub field vs image processing.

The approaches are similar, but we have to use different things in the end because audio behaves differently to images. Eg feature extraction through MFCC is a variant of Fourier, but specifically tailored for the human ear.
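For instance, the mel warping at the heart of MFCC (these are the standard textbook formulas, not tied to any one library): frequency resolution is allocated the way the ear perceives it, dense at low frequencies and sparse at high ones.

```python
import numpy as np

def hz_to_mel(f):
    """Convert frequency in Hz to the perceptual mel scale."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    """Inverse mapping: mel back to Hz."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

# Band edges spaced uniformly in mel, then mapped back to Hz: the
# spacing grows with frequency, mimicking the ear's resolution --
# this warping is what distinguishes MFCC from a plain Fourier analysis.
edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(8000.0), 10))
```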

E.g. Lea Schonherr et al.’s really good Psychoacoustic attack paper.

On the negation of attacks through transforms - important to remember that an ensemble of weak defences is not strong. Many attacks have been shown to be robust to simple transformations.


Yes, there are similar ideas to removing points, like masks and other transformations. Removing points is merely a 3D equivalent of the idea of destroying potentially adversarial information. I guess you can "remove" a pixel by setting it to a certain color, so my statement is not entirely accurate. However, point-removal methods are able to take into consideration the distribution of points, which is unique to 3D point sets. Furthermore, there are a lot of redundant points on the surface of an object, which means that removing a few points will not destroy the shape information.

This paper does suggest that we can circumvent certain domain-specific knowledge when attacking. This does not mean that we won't discover methods to utilize domain-specific knowledge in the future. I would imagine extending current provably robust methods to 3D would require domain-specific knowledge to deal with the distribution of points.


Do any methods of attack mitigation come to mind after this research? I would suspect (take this with a grain of salt; I haven't worked with machine learning for quite some time) that smaller subsets of the same data, each representing a different kind of attack mapped to another dimension in the network output (which would then be recurrently fed back into the same network), could prevent the issue. Kind of teaching the network to defend itself, recognizing patterns of attacks and discarding or compensating for them as it would noise or other kinds of interventions.

Are the attacks statistically identifiable? Can they be translated to a training subset? What would you propose?


Feeding examples generated by an attack back into the network is a very classical defense mechanism. This works ok, but it is not attack-agnostic, and removing adversarial points works better in 3D. There are also works (mostly in 2D) on detecting adversarial examples with neural networks.
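A bare-bones sketch of that classical loop, with logistic regression standing in for the network and a one-step sign-gradient perturbation standing in for the attack (every constant here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: the label depends only on the first coordinate.
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(float)

w = np.zeros(2)  # logistic regression weights (no bias, for brevity)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(200):
    p = sigmoid(X @ w)
    # Craft adversarial copies: for logistic loss, d(loss)/d(x) = (p - y) w,
    # so take a bounded step along the sign of that gradient.
    X_adv = X + 0.1 * np.sign(np.outer(p - y, w))
    # Adversarial training: fit the mix of clean and attacked examples.
    X_mix, y_mix = np.vstack([X, X_adv]), np.concatenate([y, y])
    p_mix = sigmoid(X_mix @ w)
    w -= 0.5 * X_mix.T @ (p_mix - y_mix) / len(y_mix)

clean_acc = ((sigmoid(X @ w) > 0.5) == (y == 1)).mean()
```

As noted above, the weakness of this defense is baked into the loop: the model only ever sees the one attack used to generate X_adv.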

I am not sure about statistical identification, but we show that it is difficult to identify and remove adversarial points by looking for statistical outliers.

I am not sure about truly robust 3D-specific defenses---if anyone has some idea, I am open to collaboration. I would imagine some sort of provably robust method built specifically to handle the varying density and distribution of points.


This is not at all related to the work you did, but nevertheless I have a question:

As a fellow high school student who has a good amount of experience and knowledge in deep learning, how would you recommend I move forward? I am struggling to find opportunities to show off my work and knowledge, and would like some advice.


If you want research experience, then try emailing professors directly. Summarize your experience and how it aligns with the professor's interests.

Otherwise, try reimplementing algorithms and blogging about them. Do fun projects like deploying a model online or to phones. If you are a fan of competitions, then you can try some Kaggle competitions. With some projects (if you say you have experience, then you have probably already done some), it should not be hard to get research experience because you have something to show off. Remember to post your projects on Reddit and Hacker News to get internet points and encouragement! It is quite motivating.


How did you get connected with the university researchers whom you collaborated with on this work?


Cold email plus a bit of luck, and then an informal interview in person. My best advice for anyone wanting to get connected is to not be shy. To get opportunities and connections, you have to put yourself out there. This comes with the risk of criticism.


> My best advice for anyone wanting to get connected is to not be shy.

That’s where I (fellow High School student) struggle. I’m not really at a point where I can contribute much either way.


There is one thing I forgot in my advice: you need a good foundation first. For example, when I cold emailed, I had many programming projects, and I wrote about them in the email. But after you have work done, I think aggressively (but politely) "marketing" it is important to get those opportunities.


> plus a bit of luck

What was that luck?


I got an internship at Berkeley's AI lab this summer as a HS student.

Here's what I think he means by "luck": I sent 50+ personalized emails over a span of 2-3 months to professors and anyone else I wanted to work with. Most didn't respond, but for those who did, you'll need some luck and networking skills (which I hope have improved) to convince them you're the right fit or can help without hassle.

Specifically, luck includes whether or not that person had a good day or enough time to check their email, or numerous other things.


Nice hustle! How did you decide to start in CS research? Do you attend a math & sciences HS, or have family members in academics? Or perhaps just from reading the internet?


OpenAI mostly. Started following along what they're doing and eventually spiraled into going to one of their meetups and then attending ICML this summer.

I'm not particularly a genius, but deeply passionate and curious about what I do. As a result, doing research or an internship was more fun than most other things I'd be doing over the summer/fall.

Didn't grow up in the best of situations, but learned how lucky I was to live in the Bay, giving me access to meet people IRL and grow from there.


> I'm not particularly a genius, but deeply passionate and curious about what I do

The two are often indistinguishable from each other.

You have a very wise perspective for your age. Good luck to you.


I think I've spent too much time reading PG's essays :).


Awesome, dude! Are you planning to become an undergraduate?


Yep, and it is college application season!


I did not read the paper fully, just some general commentary, hopefully useful.

First, fantastic that you're doing academic work so early. I believe most students wait far too long to be exposed to this aspect of academia (and their lives), which is far more about asking good questions, achieving deep understanding, and getting good results than about memorizing some procedure that isn't necessarily useful (which happens often in school). Stay creative and keep building a good toolset (find math tools you find interesting/useful and own them!).

Now on to the paper. I'd reiterate the importance of asking good questions almost above results. For example, the result of adversarial sticks and sinks looks good -- but is it asking the right question? If you think realistically, adversarial attacks can occur in a number of ways.

One of them is that a human classifies a dataset one way while a machine classifies it another. In this case you would also want the human to not be able to tell your data is weird or that something funky is going on -- that is clearly the case with sticks (and sinks, to a lesser extent). A human could easily be trained to spot them, and generally tell something weird is going on.

Another attack scenario is where you can modify some object, like a picture, but have some restriction on how much you can modify it. For example, you can manipulate only some bits of an image, or only perturb a small part of a real-world object that is under classification (say by putting a sticker on a car and fooling a system into thinking it is a dog, or something). If there were no restriction on your perturbation, this problem would be trivial (just replace the data with the intended object data). The justification behind sticks and sinks does not look very well founded.

So sticks/sinks do not fare too well in either case, despite looking very good in terms of success vs defenses (although there's a chance they could inspire more practical attacks).

The commentary on Hausdorff distance is relevant here, but only in the first case (fooling human judgement), and it is of course an imperfect proxy (the true metric is human perception) -- another hint that fundamentals (and applications) are important to keep in mind.

Overall the paper seems well written and I especially like the numerous illustrations.

Keep up the good work, and don't forget to always look for the inspiring, beautiful, and impactful, and to seek understanding. With a little of this in mind I have no doubt you can achieve very much. Good luck!


Thanks for your comments and encouragement!

The success rates of the attacks are not really emphasized---we only show that they work. They provide other benefits like robustness against point removal defenses.

The attacks are optimized for different criteria. The sticks attack is supposed to be easy to construct, at the cost of perceptibility. The distributional attack is more geared towards imperceptibility. Indeed, we do bound the perceptibility of the sticks and sinks attacks, just with different metrics. You can even argue that the number of sticks we generate is a measure of perceptibility. Compared to other papers in terms of visual perceptibility, our attacks are not that crazy. Of course, human perception is the true metric, and I think more work must be done on quantifying perceptibility in 3D. This paper is a first step, and I mainly wanted to show that there are factors other than perceptibility that we care about.


Super! Keep up the great work!


Off-topic, but this is the first time I’ve seen a colleague’s name show up in a random academic setting. I worked on a group project with Ronald Yu as an undergrad at USC. A smart and friendly guy.


Yep, Ronald is extremely nice and helpful!



