About Nobile Engineering
Nobile Engineering was founded by Andrea Nobile to bring expertise in computer vision algorithms and embedded applications to demanding customers.
Andrea Nobile holds a Ph.D. in numerical physics and has long research experience in both academic and industrial settings. He held critical roles in cutting-edge projects such as the QPACE project, the IBM BlueGene/Q prototype, and the IBM in-memory-processing research architecture AMC.
Andrea worked for Huawei from 2015 until 2018 as a Senior Engineer on a variety of projects involving ARM, deep neural networks, machine learning, and innovative HW architectures. Since 2018 he has worked as an independent consultant (nobile-engineering), specializing in computer vision and embedded development.
Andrea has over 15 years of experience implementing high-performance, low-latency code in C/C++, ASM, and intrinsics (x86, ARM, Power). He can work with accelerators of various kinds, having direct experience in accelerator design as a HW architect. He is an expert in code parallelization using SIMD, multithreading (pthreads, OpenMP), and network communication using MPI and other APIs. He is comfortable with low-level programming that interacts directly with the HW.
With a great passion for computer vision and machine learning, Andrea has contributed through Nobile Engineering to numerous projects in the fields of:
- Video analytics and surveillance
- Retail analytics
- Autonomous driving
- Defense
Contributions include:
- System Prototyping
- Neural network design, optimization and training
- Neural network inference and application runtime optimization
- Porting to C++
- Dataset creation, annotation, autoannotation, synthetic dataset creation
- Tooling for dataset creation/autoannotation and dataset improvement
- Design of specific algorithms for dedicated tasks
- Deployment of productionized applications
- Team management
Andrea has experience using Python and C++ with PyTorch, MXNet, TensorFlow, Caffe, scikit-learn, XGBoost, and a variety of other machine learning tools. He has experience with the internals of deep learning frameworks, with porting deep neural networks and applications from Python to C++, and with porting to TensorRT (C++ and Python). As a hobby, he wrote a CNN runtime that uses a variety of tricks to speed up the execution of CNNs on CPUs.