Leverage Large Language Models for Complex Robot Manipulation
PI Researcher: Chenliang Xu
In robot manipulation, adapting to dynamic environments with flexible task specifications is challenging. Language-based vision manipulation systems offer a solution by linking language instructions to visual data and generating actions. However, current approaches often develop vision models and action policies separately, leading to poor integration. To address this, we propose ACTLLM, a method that unifies visual interpretation and policy learning using large language models (LLMs). By generating structured scene descriptions and incorporating an action consistency loss, ACTLLM is expected to enhance the fusion of visual and policy elements, facilitating the efficient execution of complex tasks within a multi-turn visual dialogue framework.
PI Researcher: Mark Bocko
PI Researcher: and Kevin Parker
Multispectral Polarimetric Imaging of Nerves
Toward Speaker-Specific Voice Spoofing Countermeasures
IDEX Health and Science – Machine Learning for the Enhanced Design of Multilayer Optical Filters
Parverio – Second Generation Computer Vision Tools for the Study of Leukocyte Trafficking Across Vascular Barriers In Vitro
Advanced Atomization Technologies – Establishing a Multi-modal Data Fusion Framework for Acquiring Tacit Knowledge from Machinists
Immersitech – Development of a Framework for the Evaluation of Spatial Audio System Performance
Causal modeling is an active area of research for determining causal relationships when a randomized controlled experiment is not available. The RPI team has extensive experience building causal models from micro-level data. In this project, the team will apply these methods to business-relevant questions about engagement outcomes in gaming that are also relevant to theory development. Questions may involve training and skill matching in Knockout City, a skill-based collaborative/competitive game. The analytics capabilities will then be incorporated as infrastructure for future games launched by Velan Studios and made available to startups funded through Velan Ventures.
LightTopTech – Microscope design for Gabor-domain optical coherence microscopy of the brain and organoids powered by automated image processing and feature extraction
This project will develop a microscope for Gabor-domain optical coherence microscopy (GD-OCM) and related image processing tools, enabling commercialization of pre-clinical and clinical GD-OCM for brain tissue and organoid characterization.
Trendly – Few-Shot learning for Fine-Grained Object Recognition
This project addresses open problems in fine-grained object recognition. Prof. Luo will explore few-shot learning for fine-grained object recognition, using contrastive learning to extract discriminative object representations that facilitate learning from few examples. The concepts will be verified with a high-precision model for luxury bag authentication and then generalized to other fine-grained object recognition tasks such as fine art, artifacts, and jewelry.
L3Harris – Scalable Architectures and Increased Capacity for Bi-stable Resistively-coupled Ising Machine – BRIM
IBM – Learning to Localize Sources of Network Diffusion
Flaum Eye Institute – 3D eye imaging and machine learning strategies to improve cataract surgery
Cataract surgery is the most frequently performed surgery in hospitals worldwide (28 million/year). However, selecting the intraocular lens that replaces the crystalline lens relies on limited anatomical information and rudimentary formulas. This project, a collaboration with the Flaum Eye Institute at the University of Rochester, proposes using 3-D quantitative optical coherence tomography images and machine learning to obtain an accurate estimate of the lens position from the pre-operative anterior segment anatomy and the full crystalline lens shape. The method aims to improve the refractive outcomes of cataract surgery, increasing patient satisfaction and reducing the burden of refractive error correction.


Pfizer – Using Neural Network and Genetic Algorithm to Optimize Laser Surface Functionalization for Biomedical Applications
AMD – Architectural Support to Increase Ising Machine Capabilities
ACV Auctions – Auto Auction Data as a Leading Indicator of Economic Activity and Vehicle Valuation