The exploration of marine environments is crucial, yet the extreme conditions of the deep sea, combined with the segregated signal processing of current sensor technologies, lead to bulky systems, high energy consumption, and significant latency, severely constraining the development of real-time intelligent perception systems underwater. Herein, we develop a neuromorphic floating-gate transistor (NFT) that integrates electrical and optical memory functionalities and simultaneously emulates visual and auditory synaptic behaviors within a single unit, thus enabling in-memory dual-mode processing of visual-auditory signals. Electrically, the NFT achieves rapid switching (∼14 µs), a high on/off ratio (10⁶), and robust endurance (>10⁴ cycles). These characteristics enable high-accuracy (88%) classification of seafloor minerals and rocks from sonar echoes processed with a convolutional neural network (CNN). Optically, the NFT exhibits tunable synaptic weight modulation, from short-term to long-term plasticity, under 405–808 nm laser pulses. Leveraging the low-attenuation green-light window in seawater, the system, combined with RGB denoising and green-channel enhancement preprocessing, achieves 80% accuracy in marine biological image recognition. This synergistic electro-optical in-memory computing architecture provides an efficient, low-power, and compact hardware solution for intelligent perception in complex underwater environments.
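
To make the image preprocessing step concrete, the following is a minimal sketch of RGB denoising followed by green-channel enhancement. It is an illustration under stated assumptions, not the pipeline used in this work: the 3×3 median filter, the gain value of 1.5, and the function names `denoise_channel` and `enhance_green` are all hypothetical choices, since the abstract does not specify the filter or the enhancement factor.

```python
from statistics import median


def denoise_channel(channel):
    """Apply a 3x3 median filter (edge-replicated) to a 2D list of intensities.

    Assumption: a median filter stands in for the unspecified RGB denoising step.
    """
    h, w = len(channel), len(channel[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the 3x3 neighbourhood, clamping indices at the borders.
            window = [
                channel[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = median(window)
    return out


def enhance_green(image, green_gain=1.5):
    """Denoise each RGB channel, then amplify the green channel.

    image: H x W nested list of (r, g, b) tuples with values in 0..255.
    green_gain: hypothetical gain applied to the green channel, clipped at 255.
    """
    h, w = len(image), len(image[0])
    r = denoise_channel([[px[0] for px in row] for row in image])
    g = denoise_channel([[px[1] for px in row] for row in image])
    b = denoise_channel([[px[2] for px in row] for row in image])
    return [
        [(r[y][x], min(255, round(g[y][x] * green_gain)), b[y][x]) for x in range(w)]
        for y in range(h)
    ]


# Usage: a uniform 3x3 image with one saturated noise pixel in the centre.
image = [[(10, 100, 20)] * 3 for _ in range(3)]
image[1][1] = (255, 255, 255)
result = enhance_green(image)
```

Denoising before applying the gain keeps isolated noise pixels from being amplified together with the green signal; in practice the filter size and gain would need tuning against the actual underwater imagery.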