Prompted by Zach & Xander: Field can load object bounding boxes detected by a deep-learning object detector (specifically, I’m using a slightly modified Darknet implementation of YOLO2). Right now this works the way the sound analysis works: you send me data (here, video) and I’ll send you an analysis package that Field can load. Here’s the code:
That will give you a still from a video (and you ought to be able to figure out how to animate layer.time to get the video to play).
Now let’s add the analysis data to it:
The new code here is pd.rectsAt(layer.time), which returns a list of Rects giving the position (in standard Stage 0-100 coordinates) of every person found at that time. A Rect is a structure with x, y, w and h fields.
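To make the shape of that data concrete, here is a minimal sketch in plain JavaScript that runs outside Field. The rects array is hypothetical stand-in data for what pd.rectsAt(layer.time) might return, and the helpers center and area are illustrative names, not part of Field. It also assumes that x and y mark the corner of the rectangle that w and h extend from, which the text above does not state.

```javascript
// Hypothetical stand-in data: inside Field this list would come from
// pd.rectsAt(layer.time). Values are in Stage 0-100 coordinates.
const rects = [
  { x: 10, y: 20, w: 30, h: 40 },
  { x: 60, y: 5, w: 10, h: 50 },
];

// Centre of a rectangle, still in Stage coordinates.
// Assumption: (x, y) is the corner that w and h extend from.
function center(r) {
  return { x: r.x + r.w / 2, y: r.y + r.h / 2 };
}

// Area of a rectangle, a rough proxy for how large a person appears.
function area(r) {
  return r.w * r.h;
}

// One centre point per detected person.
const centers = rects.map(center);

// The detection that covers the most of the frame.
const largest = rects.reduce((a, b) => (area(b) > area(a) ? b : a));
```

Because everything stays in Stage coordinates, values like these can be used directly to position other drawing relative to each detected person.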
This is fairly representative of the state of the art a year or so ago (people come and go occasionally, and pairs of people can pose problems) and very representative of what the noise on raw tracking data of any kind looks like, from video detectors through to pitch detectors in the sonic domain.
Sometimes knowing which rectangles are where at a given frame isn’t enough: if you want to structure your code around longer-term patterns, you need to know how a particular rectangle moves over time. This isn’t information that the tracker provides directly (it merely looks frame-by-frame at the video), but it is information that we can have the computer try to infer. To get closer to this structure, there are two features:
pd.rectsAt(layer.time) returns a list of Rects but, in addition to x, y, w and h, each Rect also has a name field. This name will be something like track57 or track102. Rects with the same name come from the same ‘track’: a continuously tracked rectangle.
pd.tracksAtTime(layer.time) returns a list of Tracks, each of which describes how a Rect moves over the time range between track.startTime() and track.endTime(). You can get the (smoothly interpolated) Rect at time t by calling track.rectAtTime(t).
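As a rough illustration of what a track is, here is a minimal sketch in plain JavaScript that runs outside Field. It is not Field's implementation: the track object, its samples array and the standalone functions startTime, endTime, rectAtTime and tracksAtTime are hypothetical stand-ins that mirror the method names above, and simple linear interpolation between samples is assumed in place of whatever smoothing Field actually applies.

```javascript
// Hypothetical track: time-sorted samples of one named rectangle,
// in Stage 0-100 coordinates.
const track = {
  name: "track57",
  samples: [
    { t: 0.0, rect: { x: 10, y: 10, w: 20, h: 40 } },
    { t: 1.0, rect: { x: 20, y: 10, w: 20, h: 40 } },
    { t: 3.0, rect: { x: 40, y: 30, w: 10, h: 20 } },
  ],
};

// The time range a track covers is set by its first and last samples.
const startTime = (tr) => tr.samples[0].t;
const endTime = (tr) => tr.samples[tr.samples.length - 1].t;

// Linear blend between a and b; alpha = 0 gives a, alpha = 1 gives b.
const lerp = (a, b, alpha) => a + (b - a) * alpha;

// Rectangle at time t, linearly interpolated between the two
// neighbouring samples and clamped to the ends of the track.
function rectAtTime(tr, t) {
  const s = tr.samples;
  if (t <= startTime(tr)) return { ...s[0].rect };
  if (t >= endTime(tr)) return { ...s[s.length - 1].rect };
  const i = s.findIndex((sample) => sample.t > t); // first sample after t
  const a = s[i - 1];
  const b = s[i];
  const alpha = (t - a.t) / (b.t - a.t);
  return {
    x: lerp(a.rect.x, b.rect.x, alpha),
    y: lerp(a.rect.y, b.rect.y, alpha),
    w: lerp(a.rect.w, b.rect.w, alpha),
    h: lerp(a.rect.h, b.rect.h, alpha),
  };
}

// Tracks whose time range contains t.
function tracksAtTime(tracks, t) {
  return tracks.filter((tr) => startTime(tr) <= t && t <= endTime(tr));
}
```

Sampling rectAtTime at evenly spaced times between startTime and endTime gives a path for one person across the frame, which is the kind of longer-term structure that per-frame rectangles alone do not expose.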
For example, this code shows the current bounding boxes and their tracks:
Yielding the longer term structure of the person detector: