Methodology
We study calibrated stereoscopy as an affordable
and powerful perceptual modality with
clear potential for mass production
in traffic and transportation applications.
We address the fundamental problem
of detecting and localizing objects
of multiple classes at the same time.
We approach this problem by combining cues
arising from object appearance and
reconstructed structure of the scene.
In both cases we leverage
the capacity of deep convolutional models
trained end-to-end.
Overview of the results
Most of our research has been performed
in the context of semantic segmentation
where we densely predict
the class membership of each image pixel.
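
Dense prediction can be illustrated with a minimal sketch, assuming the model outputs one score map per class. The function name `predict_labels` and the array layout `(num_classes, H, W)` are assumptions made for illustration, not part of the project code.

```python
import numpy as np

def predict_labels(scores):
    """Turn per-class score maps into a dense label map.

    scores: array of shape (num_classes, H, W) (assumed layout).
    Returns an (H, W) array holding the index of the best-scoring class
    at every pixel.
    """
    scores = np.asarray(scores)
    return np.argmax(scores, axis=0)

# Toy example: 3 classes on a 4x5 image, class 2 wins at pixel (1, 1).
scores = np.zeros((3, 4, 5))
scores[2, 1, 1] = 1.0
labels = predict_labels(scores)
```

Every pixel receives exactly one label, so the output has the same spatial size as the input image.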
We have shown
that reconstructed structure can be exploited
to assemble a scale-invariant image representation
which is independent of the object size in pixels.
In this setup, reconstruction aids recognition
by reducing intra-class variability
and improving the usage of the training data.
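
One way to picture the scale-invariant representation is the following sketch. For a calibrated stereo rig, both the apparent object size in pixels and the disparity are proportional to the inverse depth, so the disparity indicates how much an image region must be rescaled to bring objects to a reference size. The function `select_pyramid_level`, the power-of-two image pyramid, and the parameter `ref_disparity` are assumptions made for illustration and may differ from the actual method.

```python
import numpy as np

def select_pyramid_level(disparity, ref_disparity, num_levels):
    """Pick, per pixel, the pyramid level at which objects appear at a reference size.

    Assumption: level l holds the image downsampled by a factor of 2**l.
    A pixel whose disparity is 2**l times the reference disparity shows
    objects 2**l times larger than the reference, so level l is chosen.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    # Guard against zero disparities (points at infinity or invalid matches).
    safe = np.maximum(disparity, 1e-6)
    level = np.rint(np.log2(safe / ref_disparity))
    # Levels outside the pyramid are clipped to the nearest available one.
    return np.clip(level, 0, num_levels - 1).astype(np.int64)

levels = select_pyramid_level([8.0, 16.0, 32.0, 64.0, 4.0],
                              ref_disparity=8.0, num_levels=3)
```

Storing features for several pyramid levels at every pixel is what drives up the memory footprint of such a representation.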
However, the resulting memory requirements
preclude applying this idea to high-resolution images.
We have therefore designed a lightweight architecture
which introduces ladder-style lateral connections
into a modified DenseNet classifier.
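
A ladder-style lateral connection can be sketched as follows, assuming that coarse features from the deep end of the classifier are upsampled and blended with higher-resolution features taken from an earlier stage. Nearest-neighbour upsampling, the 1x1 projection, the element-wise sum, and all function names are simplifying assumptions. The actual model may use different operators and learned weights.

```python
import numpy as np

def upsample_nearest_2x(x):
    """Upsample a (C, H, W) feature map to (C, 2H, 2W) by pixel repetition."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def project_1x1(x, weight):
    """Apply a 1x1 convolution: weight is (C_out, C_in), x is (C_in, H, W)."""
    return np.einsum('oc,chw->ohw', weight, x)

def ladder_blend(coarse, lateral, weight):
    """Blend upsampled coarse features with projected lateral features.

    coarse:  (C, H, W) semantically rich, low-resolution features.
    lateral: (C_in, 2H, 2W) higher-resolution features from an earlier stage.
    weight:  (C, C_in) projection that matches the channel counts.
    """
    return upsample_nearest_2x(coarse) + project_1x1(lateral, weight)

coarse = np.ones((4, 2, 3))
lateral = np.ones((8, 4, 6))
weight = np.full((4, 8), 0.5)
blended = ladder_blend(coarse, lateral, weight)
```

Because the upsampling path only blends existing features, it adds few parameters and little memory compared with the downsampling path.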
The resulting model achieves 74.3% mIoU
on the Cityscapes dataset and enables
the forward pass on 2 MPixel images at 7.5 Hz.
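
For reference, mIoU (mean intersection over union) can be computed from a confusion matrix as in the sketch below. The helper names are hypothetical, and the official Cityscapes evaluation additionally ignores void pixels, which this sketch omits.

```python
import numpy as np

def confusion_matrix(gt, pred, num_classes):
    """Count pixels by (ground-truth class, predicted class)."""
    gt = np.asarray(gt).ravel()
    pred = np.asarray(pred).ravel()
    # Encode each (gt, pred) pair as a single index and count occurrences.
    idx = gt * num_classes + pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def mean_iou(conf):
    """Average the per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    valid = union > 0
    return float(np.mean(tp[valid] / union[valid]))

conf = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
miou = mean_iou(conf)
```

The function returns a fraction in [0, 1]; benchmark tables usually report the same quantity multiplied by 100.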
To summarize, our main research results are as follows
(more details can be found
here):
Time frame
Start date: 1st October 2014.
Duration: 36 months.
Funding
The project budget is 704 766,91 kn,
which includes one postdoc salary
for the full duration of the project.
The project has also been granted one
additional PhD position for 2+2 years.
The project has been fully funded by the
Croatian Science Foundation
under contract I-2433-2014.