[International Edition] A Massive 3D Dataset Helps Robots Better Understand the World

国智清创雄安机器人研究院



Image: PartNet Project

The PartNet dataset has 26,671 3D models covering 24 object categories, with each object annotated with fine-grained 3D part information.


One of the things that makes humans so great at adapting to the world around us is our ability to understand entire categories of things all at once, and then use that general understanding to make sense of specific things that we’ve never seen before. For example, consider something like a lamp. We’ve all seen some lamps. Nobody has seen every single lamp there is. But in most cases, we can walk into someone’s house for the first time and easily identify all their lamps and how they work. Every once in a while, of course, there will be something incredibly weird that’ll cause you to have to ask, “Uh, is that a lamp? How do I turn it on?” But most of the time, our generalized mental model of lamps keeps us out of trouble.


It’s helpful that lamps, along with other categories of objects, have (by definition) lots of pieces in common with each other. Lamps usually have bulbs in them. They often have shades. There’s probably also a base to keep it from falling over, a body to get it off the ground, and a power cord. If you see something with all of those characteristics, it’s probably a lamp, and once you know that, you can make educated guesses about how to usefully interact with it.


This level of understanding is something that robots tend to be particularly bad at, which is a real shame because of how useful it is. You might even argue that robots will have to understand objects on a level close to this if we’re ever going to trust them to operate autonomously in unstructured environments. At the 2019 Conference on Computer Vision and Pattern Recognition (CVPR) this week, a group of researchers from Stanford, UCSD, SFU, and Intel is announcing PartNet, a huge database of common 3D objects that are broken down and annotated at the level required to, they hope, teach a robot exactly what a lamp is.


Image: PartNet Project

Example shapes with fine-grained part annotations for the 24 object categories in the PartNet dataset.


PartNet is a subset of ShapeNet, an even larger 3D database of over 50,000 common objects. PartNet has 26,671 objects across 24 categories (like doors, tables, chairs, lamps, microwaves, and clocks), and each one of those objects has been broken down into labeled component parts. Here’s what that looks like for two totally different-looking lamps:


Image: PartNet

PartNet features an expert-defined hierarchical template for each of its categories, like lamp (middle). This template includes different object types like a table lamp (left) and a ceiling lamp (right). The template is designed to be deep and comprehensive enough to cover structurally different types of lamps, while the same part concepts, such as light bulb and lamp shade, are shared across the different types.
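To make the idea of a shared hierarchical template concrete, here is a minimal Python sketch. It is not PartNet's actual file format or label set: the `PartNode` class and every part name in it are hypothetical, chosen only to mirror the lamp description in this article. It shows two structurally different lamp types that still share the same fine-grained part concepts.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class PartNode:
    """One node in a hypothetical part hierarchy (not PartNet's real format)."""
    name: str
    children: List["PartNode"] = field(default_factory=list)

    def leaves(self) -> Iterator[str]:
        """Yield the names of the finest-grained parts under this node."""
        if not self.children:
            yield self.name
            return
        for child in self.children:
            yield from child.leaves()


# Illustrative templates only; the part names are guesses based on the
# lamp description in the text, not the dataset's real labels.
table_lamp = PartNode("table lamp", [
    PartNode("base"),
    PartNode("body"),
    PartNode("lamp unit", [PartNode("light bulb"), PartNode("lamp shade")]),
    PartNode("power cord"),
])
ceiling_lamp = PartNode("ceiling lamp", [
    PartNode("ceiling mount"),
    PartNode("chain"),
    PartNode("lamp unit", [PartNode("light bulb"), PartNode("lamp shade")]),
])

# Part concepts that both lamp types have in common.
shared = sorted(set(table_lamp.leaves()) & set(ceiling_lamp.leaves()))
print(shared)  # ['lamp shade', 'light bulb']
```

A tree like this is one plausible way to keep small parts nested inside larger ones while still letting different object types reuse the same leaf labels.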


All that semantically labeled detail is what makes PartNet special. Databases like ShapeNet basically just say “here are a bunch of things that are lamps,” which has limited usefulness. PartNet, by contrast, is a way to much more fundamentally understand lamps: What parts they’re made of, where controls tend to be, and so on. Beyond just helping with a much more generalized identification of previously unseen lamps, it also makes it possible for an autonomous system (with the proper training) to make inferences about how to interact with those unseen lamps in productive ways.


As you might expect, creating PartNet was a stupendous amount of work. Nearly 70 “professional annotators” spent an average of 8 minutes annotating each and every one of those 26,671 3D shapes with a total of 573,585 parts, and then each annotation was verified at least once by another annotator. To keep things consistent, templates were created for each class of object, with the goal of minimizing the set of parts in a way that still comprehensively covered everything necessary to describe the entire object class. The parts are organized hierarchically, too, with small parts a subset of larger ones. Here’s how it all breaks down:
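The annotation figures quoted above imply some rough totals. The following is only back-of-the-envelope arithmetic, not a breakdown published by the researchers: it assumes each shape was annotated once at the stated 8-minute average, ignores the verification pass, and rounds "nearly 70" annotators to 70.

```python
# Figures quoted in the article.
num_shapes = 26_671
num_parts = 573_585
minutes_per_shape = 8   # stated average annotation time per shape
num_annotators = 70     # assumption: "nearly 70" rounded to 70

# Total annotation effort, excluding the verification pass.
total_hours = num_shapes * minutes_per_shape / 60
hours_per_annotator = total_hours / num_annotators
parts_per_shape = num_parts / num_shapes

print(f"total annotation time: about {total_hours:,.0f} hours")      # about 3,556 hours
print(f"per annotator: about {hours_per_annotator:.1f} hours")       # about 50.8 hours
print(f"average parts per shape: about {parts_per_shape:.1f}")       # about 21.5
```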


In order for this to be useful outside of PartNet itself, robots will have to be able to do the 3D segmentation step on their own, taking 3D models of objects (that the robot creates) and then breaking them down into pieces that can be identified and correlated with the existing object models. This is a tricky thing to do for a bunch of reasons: For example, you need to be able to pick out, from point clouds, individual parts that may be small but also important (like drawer pulls and door knobs), and many parts that look visually similar may be semantically quite different.
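One way to see why small parts are hard is to compare two ways of scoring a segmentation. The sketch below uses a toy point cloud with made-up labels and a hypothetical segmenter that misses the drawer pull entirely. Per-part intersection-over-union (IoU) is a common segmentation metric, used here purely as an illustration and not as a description of how the researchers evaluate their methods.

```python
from typing import Dict, List


def point_accuracy(pred: List[str], truth: List[str]) -> float:
    """Fraction of points whose predicted label matches the ground truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def part_iou(pred: List[str], truth: List[str]) -> Dict[str, float]:
    """Intersection-over-union for each part label that appears in `truth`."""
    scores = {}
    for label in sorted(set(truth)):
        inter = sum(p == label and t == label for p, t in zip(pred, truth))
        union = sum(p == label or t == label for p, t in zip(pred, truth))
        scores[label] = inter / union
    return scores


# Toy point cloud: 98 points on a drawer front, 2 points on its pull.
truth = ["drawer front"] * 98 + ["drawer pull"] * 2
# A hypothetical segmenter that labels every point as drawer front.
pred = ["drawer front"] * 100

print(point_accuracy(pred, truth))  # 0.98
print(part_iou(pred, truth))        # {'drawer front': 0.98, 'drawer pull': 0.0}
```

The per-point score looks excellent even though the one part a robot would need to grab was never found, which is why a per-part view matters for parts like pulls and knobs.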


The researchers have made some progress on this, but it’s still an area that needs more work. And that’s what PartNet is for, too—providing a dataset that can be used to develop better algorithms. At some point, PartNet may be part of a foundation for systems that can even annotate similar 3D models completely on their own, in the same way that we’ve seen autonomous driving datasets transition from human annotation to automatic annotation with human supervision. Bringing that level of semantic understanding to unfamiliar and unstructured environments will be key to those real-world adaptable robots that always seem to be right around the corner.


Source:

https://spectrum.ieee.org/automaton/robotics/artificial-intelligence/partnet-helps-robots-understand-what-things-are
