[![Contributors][contributors-shield]][contributors-url] [![Forks][forks-shield]][forks-url] [![Stargazers][stars-shield]][stars-url] [![Issues][issues-shield]][issues-url]
Usage instructions: here
Table of Contents
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2025-07-23 | SRMambaV2: Biomimetic Attention for Sparse Point Cloud Upsampling in Autonomous Driving | Chuang Chen et.al. | 2507.17479 | null |
2025-07-21 | LINR-PCGC: Lossless Implicit Neural Representations for Point Cloud Geometry Compression | Wenjie Huang et.al. | 2507.15686 | null |
2025-07-02 | Robust Multi-generation Learned Compression of Point Cloud Attribute | Xiangzuo Liu et.al. | 2507.01320 | null |
2025-06-28 | Point Cloud Compression and Objective Quality Assessment: A Survey | Yiling Xu et.al. | 2506.22902 | null |
2025-06-27 | Low-Rank Implicit Neural Representation via Schatten-p Quasi-Norm and Jacobian Regularization | Zhengyun Cheng et.al. | 2506.22134 | null |
2025-06-11 | Generalized Gaussian Entropy Model for Point Cloud Attribute Compression with Dynamic Likelihood Intervals | Changhao Peng et.al. | 2506.09510 | null |
2025-05-26 | LPCM: Learning-based Predictive Coding for LiDAR Point Cloud Compression | Chang Sun et.al. | 2505.20059 | null |
2025-05-21 | A Novel Benchmark and Dataset for Efficient 3D Gaussian Splatting with Gaussian Point Cloud Compression | Kangli Wang et.al. | 2505.18197 | link |
2025-05-23 | Reflectance Prediction-based Knowledge Distillation for Robust 3D Object Detection in Compressed Point Clouds | Hao Jing et.al. | 2505.17442 | null |
2025-05-22 | SEDD-PCC: A Single Encoder-Dual Decoder Framework For End-To-End Learned Point Cloud Compression | Kai Hsiang Hsieh et.al. | 2505.16709 | null |
2025-05-19 | Denoising Diffusion Probabilistic Model for Point Cloud Compression at Low Bit-Rates | Gabriele Spadaro et.al. | 2505.13316 | link |
2025-05-15 | SRMamba: Mamba for Super-Resolution of LiDAR Point Clouds | Chuang Chen et.al. | 2505.10601 | null |
2025-04-20 | Efficient Implicit Neural Compression of Point Clouds via Learnable Activation in Latent Space | Yichi Zhang et.al. | 2504.14471 | null |
2025-04-19 | ROI-Guided Point Cloud Geometry Compression Towards Human and Machine Vision | Xie Liang et.al. | 2504.14240 | null |
2025-04-16 | Non-uniform Point Cloud Upsampling via Local Manifold Distribution | Yaohui Fang et.al. | 2504.11701 | null |
2025-04-08 | UVG-VPC: Voxelized Point Cloud Dataset for Visual Volumetric Video-based Coding | Guillaume Gautier et.al. | 2504.05888 | null |
2025-04-01 | Hierarchical Attention Networks for Lossless Point Cloud Attribute Compression | Yueru Chen et.al. | 2504.00481 | null |
2025-03-24 | UniPCGC: Towards Practical Point Cloud Geometry Compression via an Efficient Unified Approach | Kangli Wang et.al. | 2503.18541 | link |
2025-03-24 | Voxel-based Point Cloud Geometry Compression with Space-to-Channel Context | Bojun Liu et.al. | 2503.18283 | null |
2025-03-21 | High Efficiency Wiener Filter-based Point Cloud Quality Enhancement for MPEG G-PCC | Yuxuan Wei et.al. | 2503.17467 | null |
2025-06-16 | R2LDM: An Efficient 4D Radar Super-Resolution Framework Leveraging Diffusion Model | Boyuan Zheng et.al. | 2503.17097 | null |
2025-03-18 | RBFIM: Perceptual Quality Assessment for Compressed Point Clouds Using Radial Basis Function Interpolation | Zhang Chen et.al. | 2503.14154 | null |
2025-03-16 | RENO: Real-Time Neural Compression for 3D LiDAR Point Clouds | Kang You et.al. | 2503.12382 | link |
2025-02-26 | PCE-GAN: A Generative Adversarial Network for Point Cloud Attribute Quality Enhancement based on Optimal Transport | Tian Guo et.al. | 2503.00047 | null |
2025-02-26 | SPU-IMR: Self-supervised Arbitrary-scale Point Cloud Upsampling via Iterative Mask-recovery Network | Ziming Nie et.al. | 2502.19452 | link |
2025-02-25 | Deep-JGAC: End-to-End Deep Joint Geometry and Attribute Compression for Dense Colored Point Clouds | Yun Zhang et.al. | 2502.17939 | null |
2025-02-10 | Real-Time LiDAR Point Cloud Compression and Transmission for Resource-constrained Robots | Yuhao Cao et.al. | 2502.06123 | link |
2025-02-07 | DetVPCC: RoI-based Point Cloud Sequence Compression for 3D Object Detection | Mingxuan Yan et.al. | 2502.04804 | null |
2025-02-05 | Deep Learning-based Event Data Coding: A Joint Spatiotemporal and Polarity Solution | Abdelrahman Seleem et.al. | 2502.03285 | null |
2025-02-22 | Point Cloud Upsampling as Statistical Shape Model for Pelvic | Tongxu Zhang et.al. | 2501.16716 | null |
2025-01-25 | Efficient Point Clouds Upsampling via Flow Matching | Zhi-Song Liu et.al. | 2501.15286 | null |
2025-02-28 | Representation Learning of Point Cloud Upsampling in Global and Local Inputs | Tongxu Zhang et.al. | 2501.07076 | null |
2024-12-19 | Color Enhancement for V-PCC Compressed Point Cloud via 2D Attribute Map Optimization | Jingwei Bao et.al. | 2412.14449 | null |
2024-12-16 | EGP3D: Edge-guided Geometric Preserving 3D Point Cloud Super-resolution for RGB-D camera | Zheng Fang et.al. | 2412.11680 | null |
2024-12-11 | Implicit Neural Compression of Point Clouds | Hongning Ruan et.al. | 2412.10433 | null |
2024-12-07 | Rate-Distortion Optimized Skip Coding of Region Adaptive Hierarchical Transform Coefficients for MPEG G-PCC | Zehan Wang et.al. | 2412.05574 | null |
2025-01-09 | Rendering-Oriented 3D Point Cloud Attribute Compression using Sparse Tensor-based Transformer | Xiao Huo et.al. | 2411.07899 | null |
2024-11-09 | Linear Spherical Sliced Optimal Transport: A Fast Metric for Comparing Spherical Data | Xinran Liu et.al. | 2411.06055 | null |
2024-11-01 | PLATYPUS: Progressive Local Surface Estimator for Arbitrary-Scale Point Cloud Upsampling | Donghyun Kim et.al. | 2411.00432 | null |
2024-10-28 | Quality Analysis of the Coding Bitrate Tradeoff Between Geometry and Attributes for Colored Point Clouds | Joao Prazeres et.al. | 2410.21613 | null |
2024-10-09 | Point Cloud Compression with Bits-back Coding | Nguyen Quang Hieu et.al. | 2410.18115 | null |
2024-10-23 | Att2CPC: Attention-Guided Lossy Attribute Compression of Point Clouds | Kai Liu et.al. | 2410.17823 | link |
2024-10-22 | Joint Point Cloud Upsampling and Cleaning with Octree-based CNNs | Jihe Li et.al. | 2410.17001 | link |
2024-10-21 | MBPU: A Plug-and-Play State Space Model for Point Cloud Upsamping with Fast Point Rendering | Jiayi Song et.al. | 2410.15941 | null |
2024-10-13 | Towards Reproducible Learning-based Compression | Jiahao Pang et.al. | 2410.09872 | null |
2024-10-06 | Tensor-Train Point Cloud Compression and Efficient Approximate Nearest-Neighbor Search | Georgii Novikov et.al. | 2410.04462 | null |
2024-10-01 | Can We Remove the Ground? Obstacle-aware Point Cloud Compression for Remote Object Detection | Pengxi Zeng et.al. | 2410.00582 | null |
2024-09-19 | PVContext: Hybrid Context Model for Point Cloud Compression | Guoqing Zhang et.al. | 2409.12724 | null |
2024-09-12 | The JPEG Pleno Learning-based Point Cloud Coding Standard: Serving Man and Machine | André F. R. Guarda et.al. | 2409.08130 | null |
2024-09-08 | GET-UP: GEomeTric-aware Depth Estimation with Radar Points UPsampling | Huawei Sun et.al. | 2409.02720 | link |
2024-09-03 | GaussianPU: A Hybrid 2D-3D Upsampling Framework for Enhancing Color Point Clouds via 3D Gaussian Splatting | Zixuan Guo et.al. | 2409.01581 | null |
2024-08-20 | End-to-end learned Lossy Dynamic Point Cloud Attribute Compression | Dat Thanh Nguyen et.al. | 2408.10665 | null |
2024-08-20 | Diff-PCC: Diffusion-based Neural Compression for 3D Point Clouds | Kai Liu et.al. | 2408.10543 | null |
2024-08-16 | LLM-PCGC: Large Language Model-based Point Cloud Geometry Compression | Yuqi Ye et.al. | 2408.08682 | null |
2024-08-06 | Fast Point Cloud Geometry Compression with Context-based Residual Coding and INR-based Refinement | Hao Xu et.al. | 2408.02966 | link |
2024-08-01 | Learned Compression of Point Cloud Geometry and Attributes in a Single Model through Multimodal Rate-Control | Michael Rudolph et.al. | 2408.00599 | null |
2024-07-22 | Double Deep Learning-based Event Data Coding and Classification | Abdelrahman Seleem et.al. | 2407.15531 | null |
2024-07-11 | Enhancing octree-based context models for point cloud geometry compression with attention-based child node number prediction | Chang Sun et.al. | 2407.08528 | null |
2024-07-11 | Enhancing context models for point cloud geometry compression with context feature residuals and multi-loss | Chang Sun et.al. | 2407.08520 | null |
2024-07-19 | PCAC-GAN: A Sparse-Tensor-Based Generative Adversarial Network for 3D Point Cloud Attribute Compression | Xiaolong Mao et.al. | 2407.05677 | null |
2024-07-05 | Rethinking Data Input for Point Cloud Upsampling | Tongxu Zhang et.al. | 2407.04476 | null |
2024-08-26 | TSC-PCAC: Voxel Transformer and Sparse Convolution Based Point Cloud Attribute Compression for 3D Broadcasting | Zixi Guo et.al. | 2407.04284 | link |
2024-06-15 | Full reference point cloud quality assessment using support vector regression | Ryosuke Watanabe et.al. | 2406.10520 | link |
2024-09-25 | Bits-to-Photon: End-to-End Learned Scalable Point Cloud Compression for Direct Rendering | Yueyu Hu et.al. | 2406.05915 | null |
2024-06-02 | Towards Point Cloud Compression for Machine Perception: A Simple and Strong Baseline by Learning the Octree Depth Level Predictor | Lei Liu et.al. | 2406.00791 | null |
2024-05-23 | NeuroGauss4D-PCI: 4D Neural Fields and Gaussian Deformation Fields for Point Cloud Interpolation | Chaokang Jiang et.al. | 2405.14241 | link |
2024-05-19 | Point Cloud Compression with Implicit Neural Representations: A Unified Framework | Hongning Ruan et.al. | 2405.11493 | null |
2024-05-02 | PointCompress3D -- A Point Cloud Compression Framework for Roadside LiDARs in Intelligent Transportation Systems | Walter Zimmer et.al. | 2405.01750 | null |
2024-04-21 | Pointsoup: High-Performance and Extremely Low-Decoding-Latency Learned Geometry Codec for Large-Scale Point Cloud Scenes | Kang You et.al. | 2404.13550 | link |
2024-04-16 | Reward Learning from Suboptimal Demonstrations with Applications in Surgical Electrocautery | Zohre Karimi et.al. | 2404.07185 | null |
2024-04-10 | Efficient and Generic Point Model for Lossless Point Cloud Attribute Compression | Kang You et.al. | 2404.06936 | link |
2024-04-09 | Diffusion-Based Point Cloud Super-Resolution for mmWave Radar Data | Kai Luan et.al. | 2404.06012 | null |
2024-03-13 | Point Cloud Compression via Constrained Optimal Transport | Zezeng Li et.al. | 2403.08236 | link |
2024-03-08 | Arbitrary-Scale Point Cloud Upsampling by Voxel-Based Network with Latent Geometric-Consistent Learning | Hang Du et.al. | 2403.05117 | link |
2024-03-01 | Assessing objective quality metrics for JPEG and MPEG point cloud coding | Davi Lazzarotto et.al. | 2403.00410 | null |
2024-02-23 | Scalable Human-Machine Point Cloud Compression | Mateen Ulhaq et.al. | 2402.12532 | link |
2024-02-18 | 3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods | Till Beemelmanns et.al. | 2402.11680 | link |
2024-02-17 | Hierarchical Prior-based Super Resolution for Point Cloud Geometry Compression | Dingquan Li et.al. | 2402.11250 | link |
2024-02-11 | PIVOT-Net: Heterogeneous Point-Voxel-Tree-based Framework for Point Cloud Compression | Jiahao Pang et.al. | 2402.07243 | null |
2024-02-07 | Performance analysis of Deep Learning-based Lossy Point Cloud Geometry Compression Coding Solutions | Joao Prazeres et.al. | 2402.05192 | null |
2024-02-08 | Subjective performance evaluation of bitrate allocation strategies for MPEG and JPEG Pleno point cloud compression | Davi Lazzarotto et.al. | 2402.04760 | null |
2024-02-15 | LiDAR-Forest Dataset: LiDAR Point Cloud Simulation Dataset for Forestry Application | Yawen Lu et.al. | 2402.04546 | null |
2023-12-23 | Learning Continuous Implicit Field with Local Distance Indicator for Arbitrary-Scale Point Cloud Upsampling | Shujuan Li et.al. | 2312.15133 | null |
2024-03-13 | DiffPMAE: Diffusion Masked Autoencoders for Point Cloud Reconstruction | Yanlong Li et.al. | 2312.03298 | link |
2023-12-03 | A Conditional Denoising Diffusion Probabilistic Model for Point Cloud Upsampling | Wentao Qu et.al. | 2312.02719 | link |
2023-11-22 | Learned Nonlinear Predictor for Critically Sampled 3D Point Cloud Attribute Compression | Tam Thuc Do et.al. | 2311.13539 | null |
2023-11-22 | Volumetric 3D Point Cloud Attribute Compression: Learned polynomial bilateral filter for prediction | Tam Thuc Do et.al. | 2311.13533 | null |
2023-11-22 | Test-Time Augmentation for 3D Point Cloud Classification and Segmentation | Tuan-Anh Vu et.al. | 2311.13152 | null |
2023-11-03 | PDF: Point Diffusion Implicit Function for Large-scale Scene Neural Representation | Yuhan Ding et.al. | 2311.01773 | null |
2023-11-02 | Lightweight super resolution network for point cloud geometry compression | Wei Zhang et.al. | 2311.00970 | link |
2023-11-17 | Deep Learning-based Compressed Domain Multimedia for Man and Machine: A Taxonomy and Application to Point Cloud Classification | Abdelrahman Seleem et.al. | 2310.18849 | null |
2023-10-13 | iPUNet:Iterative Cross Field Guided Point Cloud Upsampling | Guangshun Wei et.al. | 2310.09092 | link |
2024-03-15 | PU-Ray: Domain-Independent Point Cloud Upsampling via Ray Marching on Neural Implicit Surface | Sangwon Lim et.al. | 2310.08755 | link |
2024-02-16 | Quasi-Monte Carlo for 3D Sliced Wasserstein | Khai Nguyen et.al. | 2309.11713 | link |
2023-09-08 | Poster: Making Edge-assisted LiDAR Perceptions Robust to Lossy Point Cloud Compression | Jin Heo et.al. | 2309.04549 | null |
2023-09-01 | Test-Time Adaptation for Point Cloud Upsampling Using Meta-Learning | Ahmed Hatem et.al. | 2308.16484 | null |
2024-02-08 | SCP: Spherical-Coordinate-based Learned Point Cloud Compression | Ao Luo et.al. | 2308.12535 | null |
2023-08-22 | Learning a More Continuous Zero Level Set in Unsigned Distance Fields through Level Set Projection | Junsheng Zhou et.al. | 2308.11441 | link |
2023-08-11 | Learned Point Cloud Compression for Classification | Mateen Ulhaq et.al. | 2308.05959 | link |
2023-07-27 | FLiCR: A Fast and Lightweight LiDAR Point Cloud Compression Based on Lossy RI | Jin Heo et.al. | 2307.15005 | null |
2023-07-20 | Aggressive saliency-aware point cloud compression | Eleftheria Psatha et.al. | 2307.10741 | null |
2023-07-18 | Arbitrary point cloud upsampling via Dual Back-Projection Network | Zhi-Song Liu et.al. | 2307.08992 | null |
2023-06-01 | 4DSR-GCN: 4D Video Point Cloud Upsampling using Graph Convolutional Networks | Lorenzo Berlincioni et.al. | 2306.01081 | null |
2023-05-16 | Learning Dynamic Point Cloud Compression via Hierarchical Inter-frame Block Matching | Shuting Xia et.al. | 2305.05356 | link |
2023-05-02 | Geometric Prior Based Deep Human Point Cloud Geometry Compression | Xinju Wu et.al. | 2305.01309 | null |
2023-05-02 | PU-EdgeFormer: Edge Transformer for Dense Prediction in Point Cloud Upsampling | Dohoon Kim et.al. | 2305.01148 | link |
2023-04-24 | Grad-PU: Arbitrary-Scale Point Cloud Upsampling via Gradient Descent with Learned Distance Functions | Yun He et.al. | 2304.11846 | link |
2023-04-01 | Volumetric Attribute Compression for 3D Point Clouds using Feedforward Network with Geometric Attention | Tam Thuc Do et.al. | 2304.00335 | null |
2023-03-27 | NeuralPCI: Spatio-temporal Neural Field for 3D Point Cloud Multi-frame Non-linear Interpolation | Zehan Zheng et.al. | 2303.15126 | link |
2023-11-07 | GQE-Net: A Graph-based Quality Enhancement Network for Point Cloud Color Attribute | Jinrui Xing et.al. | 2303.13764 | link |
2023-03-22 | Lossless Point Cloud Attribute Compression Using Cross-scale, Cross-group, and Cross-color Prediction | Jianqiang Wang et.al. | 2303.12917 | null |
2023-12-28 | Progressive Frame Patching for FoV-based Point Cloud Video Streaming | Tongyu Zong et.al. | 2303.08336 | null |
2023-12-03 | Parametric Surface Constrained Upsampler Network for Point Cloud | Pingping Cai et.al. | 2303.08240 | link |
2024-03-20 | Lossless Point Cloud Geometry and Attribute Compression Using a Learned Conditional Probability Model | Dat Thanh Nguyen et.al. | 2303.06519 | link |
2023-03-11 | Deep probabilistic model for lossless scalable point cloud attribute compression | Dat Thanh Nguyen et.al. | 2303.06517 | null |
2023-03-09 | BIRD-PCC: Bi-directional Range Image-based Deep LiDAR Point Cloud Compression | Chia-Sheng Liu et.al. | 2303.04027 | null |
2023-02-13 | gpcgc: a green point cloud geometry coding method | Qingyang Zhou et.al. | 2302.06062 | null |
2023-02-09 | BASICS: Broad quality Assessment of Static point clouds In Compression Scenarios | Ali Ak et.al. | 2302.04796 | null |
2023-04-27 | Linear Optimal Partial Transport Embedding | Yikun Bai et.al. | 2302.03232 | link |
2023-01-31 | Lidar Upsampling with Sliced Wasserstein Distance | Artem Savkin et.al. | 2301.13558 | null |
2023-01-28 | Dynamic Point Cloud Geometry Compression Using Multiscale Inter Conditional Coding | Jianqiang Wang et.al. | 2301.12165 | null |
2023-01-27 | Joint Geometry and Attribute Upsampling of Point Clouds Using Frequency-Selective Models with Overlapped Support | Viktoria Heimann et.al. | 2301.11630 | null |
2023-01-03 | Reduced Reference Quality Assessment for Point Cloud Compression | Yipeng Liu et.al. | 2301.01009 | null |
2023-04-06 | Neural Shape Compiler: A Unified Framework for Transforming between Text, Point Cloud, and Program | Tiange Luo et.al. | 2212.12952 | null |
2022-12-11 | Learning Neural Volumetric Field for Point Cloud Geometry Compression | Yueyu Hu et.al. | 2212.05589 | link |
2022-12-01 | Low-Rank Tensor Function Representation for Multi-Dimensional Data Recovery | Yisi Luo et.al. | 2212.00262 | null |
2023-12-09 | ECM-OPCC: Efficient Context Model for Octree-based Point Cloud Compression | Yiqi Jin et.al. | 2211.10916 | null |
2022-11-19 | Rate-Distortion Modeling for Bit Rate Constrained Point Cloud Compression | Pan Gao et.al. | 2211.10646 | null |
2022-10-21 | Motion Policy Networks | Adam Fishman et.al. | 2210.12209 | link |
2022-10-28 | Motion estimation and filtered prediction for dynamic point cloud attribute compression | Haoran Hong et.al. | 2210.08262 | null |
2022-10-08 | Point Cloud Upsampling via Cascaded Refinement Network | Hang Du et.al. | 2210.03942 | link |
2023-02-14 | Multiscale Latent-Guided Entropy Model for LiDAR Point Cloud Compression | Tingyu Fan et.al. | 2209.12512 | null |
2022-09-17 | CARNet:Compression Artifact Reduction for Point Cloud Attribute | Dandan Ding et.al. | 2209.08276 | null |
2022-11-16 | CU-Net: Real-Time High-Fidelity Color Upsampling for Point Clouds | Lingdong Wang et.al. | 2209.06112 | link |
2022-09-09 | GRASP-Net: Geometric Residual Analysis and Synthesis for Point Cloud Compression | Jiahao Pang et.al. | 2209.04401 | link |
2022-09-06 | Learning to Predict on Octree for Scalable Point Cloud Geometry Coding | Yixiang Mao et.al. | 2209.02226 | null |
2022-08-26 | Efficient LiDAR Point Cloud Geometry Compression Through Neighborhood Point Attention | Ruixiang Xue et.al. | 2208.12573 | null |
2022-08-17 | Efficient dynamic point cloud coding using Slice-Wise Segmentation | Faranak Tohidi et.al. | 2208.08061 | null |
2023-01-10 | Arbitrary Point Cloud Upsampling with Spherical Mixture of Gaussians | Anthony Dell'Eva et.al. | 2208.05274 | link |
2022-08-04 | IT/IST/IPLeiria Response to the Call for Proposals on JPEG Pleno Point Cloud Coding | André F. R. Guarda et.al. | 2208.02716 | null |
2022-08-04 | IPDAE: Improved Patch-Based Deep Autoencoder for Lossy Point Cloud Geometry Compression | Kang You et.al. | 2208.02519 | link |
2022-07-25 | Inter-Frame Compression for Dynamic Point Cloud Geometry Coding | Anique Akhtar et.al. | 2207.12554 | null |
2022-07-20 | GIPSO: Geometrically Informed Propagation for Online Adaptation in 3D LiDAR Segmentation | Cristiano Saltori et.al. | 2207.09763 | link |
2022-06-25 | BIMS-PU: Bi-Directional and Multi-Scale Point Cloud Upsampling | Yechao Bai et.al. | 2206.12648 | null |
2022-06-24 | Rate-Distortion Optimal Transform Coefficient Selection for Unoccupied Regions in Video-Based Point Cloud Compression | Christian Herglotz et.al. | 2206.12186 | null |
2022-05-24 | A Rate Control Algorithm for Video-based Point Cloud Compression | Fangyu Shen et.al. | 2205.11825 | null |
2022-05-19 | A Comparative Study of Feature Expansion Unit for 3D Point Cloud Upsampling | Qiang Li et.al. | 2205.09594 | null |
2022-05-02 | D-DPCC: Deep Dynamic Point Cloud Compression via 3D Motion Prediction | Tingyu Fan et.al. | 2205.01135 | link |
2022-05-02 | Point Cloud Compression with Sibling Context and Surface Priors | Zhili Chen et.al. | 2205.00760 | link |
2022-04-29 | Deep Geometry Post-Processing for Decompressed Point Clouds | Xiaoqing Fan et.al. | 2204.13952 | link |
2022-04-27 | Density-preserving Deep Point Cloud Compression | Yun He et.al. | 2204.12684 | null |
2022-04-25 | 4DAC: Learning Attribute Compression for Dynamic Point Clouds | Guangchi Fang et.al. | 2204.11723 | null |
2022-04-25 | Dynamic Point Cloud Compression with Cross-Sectional Approach | Faranak Tohidi et.al. | 2204.11409 | null |
2022-04-22 | PU-EVA: An Edge Vector based Approximation Solution for Flexible-scale Point Cloud Upsampling | Luqing Luo et.al. | 2204.10750 | null |
2022-04-18 | Self-Supervised Arbitrary-Scale Point Clouds Upsampling via Implicit Neural Representation | Wenbo Zhao et.al. | 2204.08196 | link |
2022-06-22 | Learning-based Lossless Point Cloud Geometry Coding using Sparse Tensors | Dat Thanh Nguyen et.al. | 2204.05043 | null |
2022-04-03 | Sparse Tensor-based Point Cloud Attribute Compression | Jianqiang Wang et.al. | 2204.01023 | link |
2022-03-22 | IDEA-Net: Dynamic 3D Point Cloud Interpolation via Deep Embedding Alignment | Yiming Zeng et.al. | 2203.11590 | link |
2022-03-21 | Upsampling Autoencoder for Self-Supervised Point Cloud Learning | Cheng Zhang et.al. | 2203.10768 | null |
2022-05-03 | Frequency-Selective Mesh-to-Mesh Resampling for Color Upsampling of Point Clouds | Viktoria Heimann et.al. | 2203.09224 | null |
2022-03-02 | PUFA-GAN: A Frequency-Aware Generative Adversarial Network for 3D Point Cloud Upsampling | Hao Liu et.al. | 2203.00914 | null |
2022-05-16 | Variable Rate Compression for Raw 3D Point Clouds | Md Ahmed Al Muzaddid et.al. | 2202.13862 | link |
2022-09-14 | Point cloud completion via structured feature maps using a feedback network | Zejia Su et.al. | 2202.08583 | null |
2022-05-08 | OctAttention: Octree-Based Large-Scale Contexts Model for Point Cloud Compression | Chunyang Fu et.al. | 2202.06028 | link |
2022-02-01 | Point Cloud Compression for Efficient Data Broadcasting: A Performance Comparison | Francesco Nardo et.al. | 2202.00719 | null |
2022-02-01 | Fractional Motion Estimation for Point Cloud Compression | Haoran Hong et.al. | 2202.00172 | null |
2022-01-17 | SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations | Zhenyu Li et.al. | 2112.04680 | link |
2022-03-31 | Neural Points: Point Cloud Representation with Neural Fields for Arbitrary Upsampling | Wanquan Feng et.al. | 2112.04148 | link |
2022-03-01 | Attribute Artifacts Removal for Geometry-based Point Cloud Compression | Xihua Sheng et.al. | 2112.00560 | null |
2022-10-03 | PU-Transformer: Point Cloud Upsampling Transformer | Shi Qiu et.al. | 2111.12242 | link |
2022-10-21 | Sparse Tensor-based Multiscale Representation for Point Cloud Geometry Compression | Jianqiang Wang et.al. | 2111.10633 | link |
2021-10-18 | Patch-Based Deep Autoencoder for Point Cloud Geometry Compression | Kang You et.al. | 2110.09109 | link |
2022-07-12 | PC | Chen Long et.al. | 2109.09337 | link |
2021-09-16 | R-PCC: A Baseline for Range Image-based Point Cloud Compression | Sukai Wang et.al. | 2109.07717 | link |
2021-09-15 | Which One is Better: Assessing Objective Metrics for Point Cloud Compression | Yipeng Liu et.al. | 2109.07158 | null |
2021-08-05 | Joint Geometry and Color Projection-based Point Cloud Quality Metric | Alireza Javaheri et.al. | 2108.02481 | link |
2021-08-03 | SSPU-Net: Self-Supervised Point Cloud Upsampling via Differentiable Rendering | Yifan Zhao et.al. | 2108.00454 | link |
2021-07-29 | Video-based Point Cloud Compression Artifact Removal | Anique Akhtar et.al. | 2107.14179 | null |
2024-02-28 | Score-Based Point Cloud Denoising | Shitong Luo et.al. | 2107.10981 | link |
2022-06-08 | PU-Flow: a Point Cloud Upsampling Network with Normalizing Flows | Aihua Mao et.al. | 2107.05893 | link |
2022-04-18 | "Zero-Shot" Point Cloud Upsampling | Kaiyue Zhou et.al. | 2106.13765 | link |
2021-06-23 | Lossless Point Cloud Attribute Compression with Normal-based Intra Prediction | Qian Yin et.al. | 2106.12236 | null |
2021-06-21 | Cylindrical coordinates for LiDAR point cloud compression | Shashank N. Sridhara et.al. | 2106.11237 | null |
2021-10-11 | Neural Network Modeling of Probabilities for Coding the Octree Representation of Point Clouds | Emre Can Kaya et.al. | 2106.06482 | link |
2021-06-09 | Point Cloud Upsampling via Disentangled Refinement | Ruihui Li et.al. | 2106.04779 | link |
2021-06-02 | DeepCompress: Efficient Point Cloud Geometry Compression | Ryan Killea et.al. | 2106.01504 | link |
2021-06-01 | RAI-Net: Range-Adaptive LiDAR Point Cloud Frame Interpolation Network | Lili Zhao et.al. | 2106.00496 | null |
2021-05-28 | An Unsupervised Optical Flow Estimation For LiDAR Image Sequences | Xuezhou Guo et.al. | 2105.13879 | null |
2021-05-05 | VoxelContext-Net: An Octree based Framework for Point Cloud Compression | Zizheng Que et.al. | 2105.02158 | null |
2021-04-20 | Multiscale deep context modeling for lossless point cloud geometry compression | Dat Thanh Nguyen et.al. | 2104.09859 | link |
2021-04-12 | Towards Efficient Graph Convolutional Networks for Point Cloud Handling | Yawei Li et.al. | 2104.05706 | null |
2021-03-11 | Advanced Geometry Surface Coding for Dynamic Point Cloud Compression | Jian Xiong et.al. | 2103.06549 | null |
2021-03-05 | Hybrid Point Cloud Semantic Compression for Automotive Sensors: A Performance Evaluation | Andrea Varischio et.al. | 2103.03819 | null |
2021-02-26 | Point Cloud Upsampling and Normal Estimation using Deep Learning for Robust Surface Reconstruction | Rajat Sharma et.al. | 2102.13391 | link |
2021-02-25 | A deep perceptual metric for 3D point clouds | Maurice Quach et.al. | 2102.12839 | link |
2021-02-08 | Meta-PU: An Arbitrary-Scale Upsampling Network for Point Cloud | Shuquan Ye et.al. | 2102.04317 | null |
2020-12-15 | NeuralQAAD: An Efficient Differentiable Framework for High Resolution Point Cloud Compression | Nicolas Wagner et.al. | 2012.08143 | null |
2022-06-11 | SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization | Xinhai Liu et.al. | 2012.04439 | link |
2021-11-18 | Vehicular Cooperative Perception Through Action Branching and Federated Reinforcement Learning | Mohamed K. Abdel-Aziz et.al. | 2012.03414 | null |
2020-12-05 | ParaNet: Deep Regular Representation for 3D Point Clouds | Qijian Zhang et.al. | 2012.03028 | null |
2020-11-27 | Spherical Interpolated Convolutional Network with Distance-Feature Density for 3D Semantic Segmentation of Point Clouds | Guangming Wang et.al. | 2011.13784 | null |
2020-11-25 | Reduced Reference Perceptual Quality Model and Application to Rate Control for 3D Point Cloud Compression | Qi Liu et.al. | 2011.12688 | null |
2020-11-07 | Multiscale Point Cloud Geometry Compression | Jianqiang Wang et.al. | 2011.03799 | link |
2020-10-29 | Point Cloud Attribute Compression via Successive Subspace Graph Transform | Yueru Chen et.al. | 2010.15302 | null |
2020-08-16 | Real-Time Spatio-Temporal LiDAR Point Cloud Compression | Yu Feng et.al. | 2008.06972 | link |
2021-08-03 | Subjective Quality Database and Objective Study of Compressed Point Clouds With 6DoF Head-Mounted Display | Xinju Wu et.al. | 2008.02501 | null |
2020-06-20 | Pseudo-LiDAR Point Cloud Interpolation Based on 3D Motion Representation and Spatial Supervision | Haojie Liu et.al. | 2006.11481 | null |
2020-06-24 | Improved Deep Point Cloud Geometry Compression | Maurice Quach et.al. | 2006.09043 | link |
2020-04-03 | Intrinsic Point Cloud Interpolation via Dual Latent Space Navigation | Marie-Julie Rakotosaona et.al. | 2004.01661 | link |
2020-03-30 | A generalized Hausdorff distance based quality metric for point cloud geometry | Alireza Javaheri et.al. | 2003.13669 | null |
2020-03-30 | Optimizing Geometry Compression using Quantum Annealing | Sebastian Feld et.al. | 2003.13253 | null |
2020-03-27 | Model-based Joint Bit Allocation between Geometry and Color for Video-based 3D Point Cloud Compression | Qi Liu et.al. | 2002.10798 | null |
2020-03-07 | PUGeo-Net: A Geometry-centric Network for 3D Point Cloud Upsampling | Yue Qian et.al. | 2002.10277 | null |
2020-06-22 | Folding-based compression of point cloud attributes | Maurice Quach et.al. | 2002.04439 | null |
2020-01-13 | Efficient 3D Road Map Data Exchange for Intelligent Vehicles in Vehicular Fog Networks | Ivan Wang-Hei Ho et.al. | 2001.04057 | null |
2020-01-12 | Linear Model based Geometry Coding for Lidar Acquired Point Clouds | Xiang Zhang et.al. | 2001.03871 | null |
2021-04-09 | PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection | Shaoshuai Shi et.al. | 1912.13192 | link |
2019-12-20 | A Comprehensive Study and Comparison of Core Technologies for MPEG 3D Point Cloud Compression | Hao Liu et.al. | 1912.09674 | null |
2020-10-15 | Point Cloud Rendering after Coding: Impacts on Subjective and Objective Quality | Alireza Javaheri et.al. | 1912.09137 | null |
2021-03-29 | PU-GCN: Point Cloud Upsampling using Graph Convolutional Networks | Guocheng Qian et.al. | 1912.03264 | link |
2019-11-04 | Video-based compression for plenoptic point clouds | Li Li et.al. | 1911.01355 | null |
2019-09-26 | Learned Point Cloud Geometry Compression | Jianqiang Wang et.al. | 1909.12037 | link |
2019-09-16 | PLIN: A Network for Pseudo-LiDAR Point Cloud Interpolation | Haojie Liu et.al. | 1909.07137 | null |
2019-08-17 | 3D Point Cloud Super-Resolution via Graph Total Variation on Surface Normals | Chinthaka Dinesh et.al. | 1908.06261 | null |
2019-08-06 | Point Cloud Super Resolution with Adversarial Residual Graph Networks | Huikai Wu et.al. | 1908.02111 | link |
2020-08-10 | Predictive Generalized Graph Fourier Transform for Attribute Compression of Dynamic Point Clouds | Yiqun Xu et.al. | 1908.01970 | null |
2019-07-25 | PU-GAN: a Point Cloud Upsampling Adversarial Network | Ruihui Li et.al. | 1907.10844 | null |
2019-06-27 | A Convolutional Decoder for Point Clouds using Adaptive Instance Normalization | Isaak Lim et.al. | 1906.11478 | null |
2019-04-18 | Deep AutoEncoder-based Lossy Geometry Compression for Point Clouds | Wei Yan et.al. | 1905.03691 | null |
2019-05-22 | Learning Convolutional Transforms for Lossy Point Cloud Geometry Compression | Maurice Quach et.al. | 1903.08548 | link |
2019-09-30 | Variational Graph Methods for Efficient Point Cloud Sparsification | Daniel Tenbrinck et.al. | 1903.02858 | null |
2019-03-05 | Pose Estimation of Vehicles Over Uneven Terrain | Yingchong Ma et.al. | 1903.02052 | null |
2019-02-11 | Occupancy-map-based rate distortion optimization for video-based point cloud compression | Li Li et.al. | 1902.04169 | null |
2018-09-30 | A Volumetric Approach to Point Cloud Compression | Maja Krivokuća et.al. | 1810.00484 | null |
2018-05-29 | Surface Light Field Compression using a Point Cloud Codec | Xiang Zhang et.al. | 1805.11203 | null |
2018-05-23 | Comments on "Compression of 3D Point Clouds Using a Region-Adaptive Hierarchical Transform" | Gustavo Sandri et.al. | 1805.09146 | null |
2018-04-28 | Hybrid Point Cloud Attribute Compression Using Slice-based Layered Structure and Block-based Intra Prediction | Yiting Shao et.al. | 1804.10783 | null |
2018-03-26 | PU-Net: Point Cloud Upsampling Network | Lequan Yu et.al. | 1801.06761 | link |
2017-10-10 | Attribute Compression of 3D Point Clouds Using Laplacian Sparsity Optimized Graph Transform | Yiting Shao et.al. | 1710.03532 | null |
2017-03-08 | Dynamic Polygon Clouds: Representation and Compression for VR/AR | Philip A. Chou et.al. | 1610.00402 | null |
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2025-07-23 | Large Learning Rates Simultaneously Achieve Robustness to Spurious Correlations and Compressibility | Melih Barsbey et.al. | 2507.17748 | null |
2025-07-23 | Dataset Distillation as Data Compression: A Rate-Utility Perspective | Youneng Bao et.al. | 2507.17221 | null |
2025-07-22 | DASPack: Controlled Data Compression for Distributed Acoustic Sensing | Aleix Segui et.al. | 2507.16390 | null |
2025-07-22 | LongSplat: Online Generalizable 3D Gaussian Splatting from Long Sequence Images | Guichen Huang et.al. | 2507.16144 | null |
2025-07-21 | Compress-Align-Detect: onboard change detection from unregistered images | Gabriele Inzerillo et.al. | 2507.15578 | null |
2025-07-21 | Conditional Video Generation for High-Efficiency Video Compression | Fangqiu Yi et.al. | 2507.15269 | null |
2025-07-20 | Rate-Distortion-Perception Trade-off with Strong Realism Constraints: Role of Side Information and Common Randomness | Yassine Hamdi et.al. | 2507.14825 | null |
2025-07-19 | Adaptive 3D Gaussian Splatting Video Streaming | Han Gong et.al. | 2507.14432 | null |
2025-07-22 | IPPRO: Importance-based Pruning with PRojective Offset for Magnitude-indifferent Structural Pruning | Jaeheun Jung et.al. | 2507.14171 | null |
2025-07-18 | Pass-efficient Randomized Algorithms for Low-rank Approximation of Quaternion Matrices | Salman Ahmadi-Asl et.al. | 2507.13731 | null |
2025-07-16 | Online Training and Pruning of Deep Reinforcement Learning Networks | Valentin Frank Ingmar Guenter et.al. | 2507.11975 | null |
2025-07-16 | CompressedVQA-HDR: Generalized Full-reference and No-reference Quality Assessment Models for Compressed High Dynamic Range Videos | Wei Sun et.al. | 2507.11900 | null |
2025-07-15 | Boosting Scientific Error-Bounded Lossy Compression through Optimized Synergistic Lossy-Lossless Orchestration | Shixun Wu et.al. | 2507.11165 | null |
2025-07-16 | TRAN-D: 2D Gaussian Splatting-based Sparse-view Transparent Object Depth Reconstruction via Physics Simulation for Scene Update | Jeongyun Kim et.al. | 2507.11069 | null |
2025-07-13 | A Survey on Medical Image Compression: From Traditional to Learning-Based | Guofeng Tong et.al. | 2507.10615 | null |
2025-07-12 | Geometric Generative Modeling with Noise-Conditioned Graph Networks | Peter Pao-Huang et.al. | 2507.09391 | null |
2025-07-11 | T-GVC: Trajectory-Guided Generative Video Coding at Ultra-Low Bitrates | Zhitao Wang et.al. | 2507.07633 | null |
2025-07-09 | Assessing Learned Models for Phase-only Hologram Compression | Zicong Peng et.al. | 2507.06646 | null |
2025-07-08 | Optimization of Fractal Image Compression | Nastaran Pourshab et.al. | 2507.06325 | null |
2025-07-08 | Error Exponents for Quantum Packing Problems via An Operator Layer Cake Theorem | Hao-Chung Cheng et.al. | 2507.06232 | null |
2025-07-08 | GSVR: 2D Gaussian-based Video Representation for 800+ FPS with Hybrid Deformation Field | Zhizhuo Pang et.al. | 2507.05594 | null |
2025-06-29 | Closed-Form Robustness Bounds for Second-Order Pruning of Neural Controller Policies | Maksym Shamrai et.al. | 2507.02953 | null |
2025-07-02 | Generative Latent Diffusion for Efficient Spatiotemporal Data Reduction | Xiao Li et.al. | 2507.02129 | null |
2025-07-02 | Perception-Oriented Latent Coding for High-Performance Compressed Domain Semantic Inference | Xu Zhang et.al. | 2507.01608 | null |
2025-07-01 | LotteryCodec: Searching the Implicit Representation in a Random Network for Low-Complexity Image Compression | Haotian Wu et.al. | 2507.01204 | null |
2025-07-01 | Tensor network methods for the Gross-Pitaevskii equation on fine grids | Ryan J. J. Connor et.al. | 2507.01149 | null |
2025-07-03 | Customizable ROI-Based Deep Image Compression | Jian Jin et.al. | 2507.00373 | null |
2025-06-30 | How to Design and Train Your Implicit Neural Representation for Video Compression | Matthew Gwilliam et.al. | 2506.24127 | null |
2025-06-29 | SIEDD: Shared-Implicit Encoder with Discrete Decoders | Vikram Rangarajan et.al. | 2506.23382 | null |
2025-06-28 | A Novel Adaptive Low-Rank Matrix Approximation Method for Image Compression and Reconstruction | Weiwei Xu et.al. | 2506.22713 | null |
2025-06-27 | dreaMLearning: Data Compression Assisted Machine Learning | Xiaobo Zhao et.al. | 2506.22190 | null |
2025-06-27 | StableCodec: Taming One-Step Diffusion for Extreme Image Compression | Tianyu Zhang et.al. | 2506.21977 | null |
2025-06-27 | End-to-End RGB-IR Joint Image Compression With Channel-wise Cross-modality Entropy Model | Haofeng Wang et.al. | 2506.21851 | null |
2025-06-26 | Succinct Preferential Attachment Graphs | Ziad Ismaili Alaoui et.al. | 2506.21436 | null |
2025-06-26 | Uncover Treasures in DCT: Advancing JPEG Quality Enhancement by Exploiting Latent Correlations | Jing Yang et.al. | 2506.21171 | null |
2025-06-26 | Linearity-based neural network compression | Silas Dobler et.al. | 2506.21146 | null |
2025-06-24 | Video Compression for Spatiotemporal Earth System Data | Oscar J. Pellicer-Valero et.al. | 2506.19656 | null |
2025-06-24 | Explicit Residual-Based Scalable Image Coding for Humans and Machines | Yui Tatsumi et.al. | 2506.19297 | null |
2025-06-23 | NIC-RobustBench: A Comprehensive Open-Source Toolkit for Neural Image Compression and Robustness Analysis | Georgii Bychkov et.al. | 2506.19051 | null |
2025-06-23 | On the computation of tensor functions under tensor-tensor multiplications with linear maps | Jeong-Hoon Ju et.al. | 2506.18713 | null |
2025-06-25 | LVPNet: A Latent-variable-based Prediction-driven End-to-end Framework for Lossless Compression of Medical Images | Chenyue Song et.al. | 2506.17983 | null |
2025-06-19 | DiffO: Single-step Diffusion for Image Compression at Ultra-Low Bitrates | Chanung Park et.al. | 2506.16572 | link |
2025-06-19 | DT-UFC: Universal Large Model Feature Coding via Peaky-to-Balanced Distribution Transformation | Changsheng Gao et.al. | 2506.16495 | null |
2025-06-19 | Watermarking Autoregressive Image Generation | Nikola Jovanović et.al. | 2506.16349 | link |
2025-06-19 | Data Compression with Relative Entropy Coding | Gergely Flamich et.al. | 2506.16309 | null |
2025-06-19 | Fast Training-free Perceptual Image Compression | Ziran Zhu et.al. | 2506.16102 | null |
2025-06-19 | Information-computation trade-offs in non-linear transforms | Connor Ding et.al. | 2506.15948 | null |
2025-06-18 | GratNet: A Photorealistic Neural Shader for Diffractive Surfaces | Narayan Kandel et.al. | 2506.15815 | null |
2025-06-20 | Show-o2: Improved Native Unified Multimodal Models | Jinheng Xie et.al. | 2506.15564 | link |
2025-06-18 | MSNeRV: Neural Video Representation with Multi-Scale Feature Fusion | Jun Zhu et.al. | 2506.15276 | null |
2025-06-18 | ABC: Adaptive BayesNet Structure Learning for Computational Scalable Multi-task Image Compression | Yufeng Zhang et.al. | 2506.15228 | link |
2025-06-24 | Privacy-Shielded Image Compression: Defending Against Exploitation from Vision-Language Pretrained Models | Xuelin Shen et.al. | 2506.15201 | link |
2025-06-17 | Breaking the Multi-Enhancement Bottleneck: Domain-Consistent Quality Enhancement for Compressed Images | Qunliang Xing et.al. | 2506.14152 | null |
2025-06-16 | Dark Energy Survey Year 3 results: | J. Prat et.al. | 2506.13439 | null |
2025-06-16 | Audio-Visual Driven Compression for Low-Bitrate Talking Head Videos | Riku Takahashi et.al. | 2506.13419 | null |
2025-06-15 | Zero-shot denoising via neural compression: Theoretical and algorithmic framework | Ali Zafari et.al. | 2506.12693 | link |
2025-06-13 | Applications of Combinatorics on Words with Symbolic Dynamics | Duaa Abdullah et.al. | 2506.12150 | null |
2025-06-13 | Scheduling Agile Earth Observation Satellites with Onboard Processing and Real-Time Monitoring | Antonio M. Mercado-Martínez et.al. | 2506.11556 | null |
2025-06-12 | Rethinking Generative Human Video Coding with Implicit Motion Transformation | Bolin Chen et.al. | 2506.10453 | null |
2025-06-11 | GPU Acceleration of SQL Analytics on Compressed Data | Zezhou Huang et.al. | 2506.10092 | null |
2025-06-11 | Generalized Gaussian Entropy Model for Point Cloud Attribute Compression with Dynamic Likelihood Intervals | Changhao Peng et.al. | 2506.09510 | null |
2025-06-11 | ScaleLSD: Scalable Deep Line Segment Detection Streamlined | Zeran Ke et.al. | 2506.09369 | link |
2025-06-10 | Optimizing Learned Image Compression on Scalar and Entropy-Constraint Quantization | Florian Borzechowski et.al. | 2506.08662 | null |
2025-06-10 | AstroCompress: A benchmark dataset for multi-purpose compression of astronomical data | Tuan Truong et.al. | 2506.08306 | link |
2025-06-09 | Fine-Grained Motion Compression and Selective Temporal Fusion for Neural B-Frame Video Coding | Xihua Sheng et.al. | 2506.07709 | null |
2025-06-09 | Generative Models at the Frontier of Compression: A Survey on Generative Face Video Coding | Bolin Chen et.al. | 2506.07369 | null |
2025-06-07 | Towards AI-Native Fronthaul: Neural Compression for NextG Cloud RAN | Chenghong Bian et.al. | 2506.06925 | null |
2025-06-07 | Employing Discrete Fourier Transform in Representational Learning | Raoof HojatJalali et.al. | 2506.06765 | null |
2025-06-13 | The JPEG XL Image Coding System: History, Features, Coding Tools, Design Rationale, and Future | Jon Sneyers et.al. | 2506.05987 | null |
2025-06-05 | Kernel | Thore Gerlach et.al. | 2506.04786 | null |
2025-06-05 | A Fast, Accurate and Oscillation-free Spectral Collocation Solver for High-dimensional Transport Problems | Nicola Cavallini et.al. | 2506.04732 | null |
2025-06-03 | MUC-G4: Minimal Unsat Core-Guided Incremental Verification for Deep Neural Network Compression | Jingyang Li et.al. | 2506.04268 | null |
2025-06-03 | Approximate Borderline Sampling using Granular-Ball for Classification Tasks | Qin Xie et.al. | 2506.02366 | null |
2025-06-11 | Fourier-Modulated Implicit Neural Representation for Multispectral Satellite Image Compression | Woojin Cho et.al. | 2506.01234 | null |
2025-06-02 | Structured Pruning and Quantization for Learned Image Compression | Md Adnan Faisal Hossain et.al. | 2506.01229 | link |
2025-06-02 | Flexible Mixed Precision Quantization for Learned Image Compression | Md Adnan Faisal Hossain et.al. | 2506.01221 | link |
2025-06-09 | ViVo: A Dataset for Volumetric Video Reconstruction and Compression | Adrian Azzarelli et.al. | 2506.00558 | null |
2025-05-31 | Your Demands Deserve More Bits: Referring Semantic Image Compression at Ultra-low Bitrate | Chenhao Wu et.al. | 2506.00526 | null |
2025-05-30 | SiLVR: A Simple Language-based Video Reasoning Framework | Ce Zhang et.al. | 2505.24869 | link |
2025-05-30 | Likelihoods for Stochastic Gravitational Wave Background Data Analysis | Gabriele Franciolini et.al. | 2505.24695 | null |
2025-05-29 | Semantics-Guided Generative Image Compression | Cheng-Lin Wu et.al. | 2505.24015 | link |
2025-05-29 | Position Dependent Prediction Combination For Intra-Frame Video Coding | Amir Said et.al. | 2505.23672 | null |
2025-05-29 | Low-Complexity Transform Adjustments For Video Coding | Amir Said et.al. | 2505.23618 | null |
2025-05-29 | Quasi-Periodic Optical Key-Enabled Hybrid Cryptography: Merging Diffractive Physics and Deep Learning for High-Dimensional Security | Haiqi Gao et.al. | 2505.23479 | null |
2025-05-28 | 3DGS Compression with Sparsity-guided Hierarchical Transform Coding | Hao Xu et.al. | 2505.22908 | null |
2025-05-28 | Interpretable Scaling Behavior in Sparse Subnetwork Representations of Quantum States | Brandon Barton et.al. | 2505.22734 | null |
2025-05-28 | Synonymous Variational Inference for Perceptual Image Compression | Zijian Liang et.al. | 2505.22438 | link |
2025-05-27 | Highly Efficient Non-Separable Transforms for Next Generation Video Coding | Amir Said et.al. | 2505.21728 | null |
2025-05-28 | HoliTom: Holistic Token Merging for Fast Video Large Language Models | Kele Shao et.al. | 2505.21334 | link |
2025-05-27 | Transfer learning for multifidelity simulation-based inference in cosmology | Alex A. Saoulis et.al. | 2505.21215 | null |
2025-05-27 | Generative Image Compression by Estimating Gradients of the Rate-variable Feature Distribution | Minghao Han et.al. | 2505.20984 | null |
2025-05-27 | QwT-v2: Practical, Effective and Efficient Post-Training Quantization | Ningyuan Tang et.al. | 2505.20932 | null |
2025-05-27 | Stationary MMD Points for Cubature | Zonghao Chen et.al. | 2505.20754 | link |
2025-05-26 | MiniLongBench: The Low-cost Long Context Understanding Benchmark for Large Language Models | Zhongzhan Huang et.al. | 2505.19959 | link |
2025-05-28 | A Comprehensive Real-World Assessment of Audio Watermarking Algorithms: Will They Survive Neural Codecs? | Yigitcan Özer et.al. | 2505.19663 | link |
2025-05-26 | Information-theoretic Generalization Analysis for VQ-VAEs: A Role of Latent Variables | Futoshi Futami et.al. | 2505.19470 | null |
2025-05-26 | TokBench: Evaluating Your Visual Tokenizer before Visual Generation | Junfeng Wu et.al. | 2505.18142 | null |
2025-05-23 | Accelerating Learned Image Compression Through Modeling Neural Training Dynamics | Yichi Zhang et.al. | 2505.18107 | null |
2025-05-23 | Efficient compression of neural networks and datasets | Lukas Silvester Barth et.al. | 2505.17469 | link |
2025-05-29 | Low-Rank Adaptation of Pre-trained Vision Backbones for Energy-Efficient Image Coding for Machine | Yichi Zhang et.al. | 2505.17366 | link |
2025-05-22 | One-Step Diffusion-Based Image Compression with Semantic Distillation | Naifu Xue et.al. | 2505.16687 | null |
2025-05-22 | Is Quantum Optimization Ready? An Effort Towards Neural Network Compression using Adiabatic Quantum Computing | Zhehui Wang et.al. | 2505.16332 | null |
2025-05-22 | Learning novel representations of variable sources from multi-modal | P. Huijse et.al. | 2505.16320 | null |
2025-05-22 | TacCompress: A Benchmark for Multi-Point Tactile Data Compression in Dexterous Manipulation | Yang Li et.al. | 2505.16289 | null |
2025-05-22 | Generative Latent Coding for Ultra-Low Bitrate Image and Video Compression | Linfeng Qi et.al. | 2505.16177 | null |
2025-05-22 | Compressing Human Body Video with Interactive Semantics: A Generative Approach | Bolin Chen et.al. | 2505.16152 | null |
2025-05-24 | OSCAR: One-Step Diffusion Codec Across Multiple Bit-rates | Jinpei Guo et.al. | 2505.16091 | link |
2025-05-21 | Combining Progressive Image Compression and Random Access in DNA Data Storage | Xavier Pic et.al. | 2505.15632 | null |
2025-05-21 | Robo-DM: Data Management For Large Robot Datasets | Kaiyuan Chen et.al. | 2505.15558 | null |
2025-05-21 | Certified Neural Approximations of Nonlinear Dynamics | Frederik Baymler Mathiesen et.al. | 2505.15497 | link |
2025-05-21 | Joint Optimization of Primary and Secondary Transforms Using Rate-Distortion Optimized Transform Design | Darukeesan Pakiyarajah et.al. | 2505.15104 | null |
2025-05-20 | Neural Video Compression with Context Modulation | Chuanbo Tang et.al. | 2505.14541 | null |
2025-05-20 | Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models | Xuyang Liu et.al. | 2505.14454 | link |
2025-05-19 | GANCompress: GAN-Enhanced Neural Image Compression with Binary Spherical Quantization | Karthik Sivakoti et.al. | 2505.13542 | null |
2025-05-21 | DD-Ranking: Rethinking the Evaluation of Dataset Distillation | Zekai Li et.al. | 2505.13300 | link |
2025-05-19 | Higher fidelity perceptual image and video compression with a latent conditioned residual denoising diffusion model | Jonas Brenig et.al. | 2505.13152 | link |
2025-05-19 | Towards a Universal Image Degradation Model via Content-Degradation Disentanglement | Wenbo Yang et.al. | 2505.12860 | null |
2025-05-16 | Is Compression Really Linear with Code Intelligence? | Xianzhen Luo et.al. | 2505.11441 | null |
2025-05-15 | Generalization of Repetitiveness Measures for Two-Dimensional Strings | Lorenzo Carfagna et.al. | 2505.10680 | null |
2025-05-15 | MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning | Ke Wang et.al. | 2505.10557 | link |
2025-05-15 | LAV: Audio-Driven Dynamic Visual Generation with Neural Compression and StyleGAN2 | Jongmin Jung et.al. | 2505.10101 | null |
2025-05-15 | High Quality Underwater Image Compression with Adaptive Correction and Codebook-based Augmentation | Yimin Zhou et.al. | 2505.09986 | null |
2025-05-14 | Efficient LiDAR Reflectance Compression via Scanning Serialization | Jiahao Zhu et.al. | 2505.09433 | null |
2025-05-14 | Neural Video Compression using 2D Gaussian Splatting | Lakshya Gupta et.al. | 2505.09324 | null |
2025-05-15 | BiECVC: Gated Diversification of Bidirectional Contexts for Learned Video Compression | Wei Jiang et.al. | 2505.09193 | link |
2025-05-13 | Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion | Anle Ke et.al. | 2505.08281 | link |
2025-05-13 | Clustering-based Low-Rank Matrix Approximation: An Adaptive Theoretical Analysis with Application to Data Compression | Sisipho Hamlomo et.al. | 2505.08256 | null |
2025-05-15 | Towards Scalable IoT Deployment for Visual Anomaly Detection via Efficient Compression | Arianna Stropeni et.al. | 2505.07119 | null |
2025-05-10 | ReplayCAD: Generative Diffusion Replay for Continual Anomaly Detection | Lei Hu et.al. | 2505.06603 | link |
2025-05-10 | TierBase: A Workload-Driven Cost-Optimized Key-Value Store | Zhitao Shen et.al. | 2505.06556 | null |
2025-05-10 | Event-based Neural Spike Detection Using Spiking Neural Networks for Neuromorphic iBMI Systems | Chanwook Hwang et.al. | 2505.06544 | null |
2025-05-09 | Strong converse Exponents of Partially Smoothed Information Measures | Mario Berta et.al. | 2505.06050 | null |
2025-05-09 | Towards Facial Image Compression with Consistency Preserving Diffusion Prior | Yimin Zhou et.al. | 2505.05870 | null |
2025-05-09 | PICD: Versatile Perceptual Image Compression with Diffusion Rendering | Tongda Xu et.al. | 2505.05853 | null |
2025-05-08 | Augmented Deep Contexts for Spatially Embedded Video Coding | Yifan Bian et.al. | 2505.05309 | link |
2025-05-07 | Steerable Scene Generation with Post Training and Inference-Time Search | Nicholas Pfaff et.al. | 2505.04831 | link |
2025-05-06 | 3D Gaussian Splatting Data Compression with Mixture of Priors | Lei Liu et.al. | 2505.03310 | null |
2025-05-05 | Gone With the Bits: Revealing Racial Bias in Low-Rate Neural Compression for Facial Images | Tian Qiu et.al. | 2505.02949 | null |
2025-05-05 | A Rate-Quality Model for Learned Video Coding | Sang NguyenQuang et.al. | 2505.02720 | null |
2025-05-05 | Data Compression for Time Series Modelling: A Case Study of Smart Grid Demand Forecasting | Mikkel Bue Lykkegaard et.al. | 2505.02606 | null |
2025-05-04 | Robust Localization, Mapping, and Navigation for Quadruped Robots | Dyuman Aditya et.al. | 2505.02272 | null |
2025-05-03 | HybridGS: High-Efficiency Gaussian Splatting Data Compression using Dual-Channel Sparse Representation and Point Cloud Encoder | Qi Yang et.al. | 2505.01938 | link |
2025-05-14 | Easz: An Agile Transformer-based Image Compression Framework for Resource-constrained IoTs | Yu Mao et.al. | 2505.01742 | null |
2025-05-06 | SemSpaceFL: A Collaborative Hierarchical Federated Learning Framework for Semantic Communication in 6G LEO Satellites | Loc X. Nguyen et.al. | 2505.00966 | null |
2025-05-01 | Compressing fluid flows with nonlinear machine learning: mode decomposition, latent modeling, and flow control | Koji Fukagata et.al. | 2505.00343 | link |
2025-05-01 | Efficient Neural Video Representation with Temporally Coherent Modulation | Seungjun Shin et.al. | 2505.00335 | null |
2025-04-30 | SR-NeRV: Improving Embedding Efficiency of Neural Video Representation via Super-Resolution | Taiga Hayami et.al. | 2505.00046 | null |
2025-04-29 | Convolutional Autoencoders for Data Compression and Anomaly Detection in Small Satellite Technologies | Dishanand Jayeprokash et.al. | 2505.00040 | null |
2025-04-30 | Why Compress What You Can Generate? When GPT-4o Generation Ushers in Image Compression Fields | Yixin Gao et.al. | 2504.21814 | null |
2025-04-30 | LoC-LIC: Low Complexity Learned Image Coding Using Hierarchical Feature Transforms | Ayman A. Ameen et.al. | 2504.21778 | null |
2025-05-04 | Distributed Online Randomized Gradient-Free optimization with Compressed Communication | Longkang Zhu et.al. | 2504.21693 | null |
2025-05-10 | MagicPortrait: Temporally Consistent Face Reenactment with 3D Geometric Guidance | Mengting Wei et.al. | 2504.21497 | link |
2025-05-08 | Invariant Bridges Between Four Successive Points: A New Tool for Data Coding | Stanislav Semenov et.al. | 2504.21473 | null |
2025-04-30 | Emerging Advances in Learned Video Compression: Models, Systems and Beyond | Chuanmin Jia et.al. | 2504.21445 | null |
2025-04-29 | Neural Stereo Video Compression with Hybrid Disparity Compensation | Shiyin Jiang et.al. | 2504.20383 | null |
2025-04-25 | Low-Rank Matrix Approximation for Neural Network Compression | Kalyan Cherukuri et.al. | 2504.20078 | null |
2025-04-28 | Enhancing Quality for VVC Compressed Videos with Omniscient Quality Enhancement Model | Xiem HoangVan et.al. | 2504.19935 | null |
2025-04-28 | Universally Wheeler Languages | Ruben Becker et.al. | 2504.19537 | null |
2025-04-27 | MLICv2: Enhanced Multi-Reference Entropy Modeling for Learned Image Compression | Wei Jiang et.al. | 2504.19119 | null |
2025-04-26 | Revisiting Transformers through the Lens of Low Entropy and Dynamic Sparsity | Ruifeng Ren et.al. | 2504.18929 | null |
2025-04-30 | 4DGS-CC: A Contextual Coding Framework for 4D Gaussian Splatting Data Compression | Zicong Chen et.al. | 2504.18925 | null |
2025-04-25 | MROP: Modulated Rank-One Projections for compressive radio interferometric imaging | Olivier Leblanc et.al. | 2504.18446 | null |
2025-04-25 | Partition Map-Based Fast Block Partitioning for VVC Inter Coding | Xinmin Feng et.al. | 2504.18398 | link |
2025-04-24 | Quantum Autoencoder for Multivariate Time Series Anomaly Detection | Kilian Tscharke et.al. | 2504.17548 | null |
2025-04-24 | Range Image-Based Implicit Neural Compression for LiDAR Point Clouds | Akihiro Kuwabara et.al. | 2504.17229 | null |
2025-04-22 | TVC: Tokenized Video Compression with Ultra-Low Bitrate | Lebin Zhou et.al. | 2504.16953 | null |
2025-04-23 | Learning Switchable Priors for Neural Image Compression | Haotian Zhang et.al. | 2504.16586 | null |
2025-04-24 | FPGA-Based Neural Network Accelerators for Space Applications: A Survey | Pedro Antunes et.al. | 2504.16173 | null |
2025-04-22 | Low-Rank Adaptation of Neural Fields | Anh Truong et.al. | 2504.15933 | null |
2025-04-21 | Plug-and-Play Versatile Compressed Video Enhancement | Huimin Zeng et.al. | 2504.15380 | null |
2025-04-23 | Physics-Aware Compression of Plasma Distribution Functions with GPU-Accelerated Gaussian Mixture Models | Andong Hu et.al. | 2504.14897 | null |
2025-04-20 | Efficient Implicit Neural Compression of Point Clouds via Learnable Activation in Latent Space | Yichi Zhang et.al. | 2504.14471 | null |
2025-04-19 | Text-Audio-Visual-conditioned Diffusion Model for Video Saliency Prediction | Li Yu et.al. | 2504.14267 | null |
2025-04-18 | Statistical Analysis and End-to-End Performance Evaluation of Traffic Models for Automotive Data | Marcello Bullo et.al. | 2504.14017 | null |
2025-04-17 | Multiscale Tensor Summation Factorization as a New Neural Network Layer (MTS Layer) for Multidimensional Data Processing | Mehmet Yamaç et.al. | 2504.13975 | link |
2025-04-18 | LimitNet: Progressive, Content-Aware Image Offloading for Extremely Weak Devices & Networks | Ali Hojjat et.al. | 2504.13736 | link |
2025-04-21 | SolarZip: An Efficient and Adaptive Compression Framework for Solar EUV Imaging Data | Zedong Liu et.al. | 2504.13504 | null |
2025-04-12 | Universal Representations for Classification-enhanced Lossy Compression | Nam Nguyen et.al. | 2504.13191 | null |
2025-04-17 | All-in-One Transferring Image Compression from Human Perception to Multi-Machine Perception | Jiancheng Zhao et.al. | 2504.12997 | null |
2025-04-17 | Efficient Masked Image Compression with Position-Indexed Self-Attention | Chengjie Dai et.al. | 2504.12923 | null |
2025-04-16 | CodingHomo: Bootstrapping Deep Homography With Video Coding | Yike Liu et.al. | 2504.12165 | link |
2025-04-15 | Randomized block proximal method with locally Lipschitz continuous gradient | Pedro Pérez-Aros et.al. | 2504.11410 | link |
2025-04-15 | DeepSelective: Feature Gating and Representation Matching for Interpretable Clinical Prediction | Ruochi Zhang et.al. | 2504.11264 | null |
2025-04-15 | Guided Wave-Based Structural Awareness Under Varying Operating States via Manifold Representations | Yiming Fan et.al. | 2504.11235 | null |
2025-04-14 | VAE-based Feature Disentanglement for Data Augmentation and Compression in Generalized GNSS Interference Classification | Lucas Heublein et.al. | 2504.10556 | null |
2025-04-10 | Data Compression for Fast Online Stochastic Optimization | Irina Wang et.al. | 2504.08097 | link |
2025-04-09 | LVC: A Lightweight Compression Framework for Enhancing VLMs in Long Video Understanding | Ziyi Wang et.al. | 2504.06835 | null |
2025-04-17 | FANeRV: Frequency Separation and Augmentation based Neural Representation for Video | Li Yu et.al. | 2504.06755 | null |
2025-04-10 | Subjective Visual Quality Assessment for High-Fidelity Learning-Based Image Compression | Mohsen Jenadeleh et.al. | 2504.06301 | link |
2025-04-08 | Old and New Results on Alphabetic Codes | Roberto Bruno et.al. | 2504.05959 | null |
2025-04-07 | InteractVLM: 3D Interaction Reasoning from 2D Foundational Models | Sai Kumar Dwivedi et.al. | 2504.05303 | link |
2025-04-07 | One-Minute Video Generation with Test-Time Training | Karan Dalal et.al. | 2504.05298 | null |
2025-04-07 | Randomized block Krylov method for approximation of truncated tensor SVD | Malihe Nobakht Kooshkghazi et.al. | 2504.04989 | null |
2025-04-07 | 3DM-WeConvene: Learned Image Compression with 3D Multi-Level Wavelet-Domain Convolution and Entropy Model | Haisheng Fu et.al. | 2504.04658 | link |
2025-04-04 | Three Forensic Cues for JPEG AI Images | Sandra Bergmann et.al. | 2504.03191 | null |
2025-04-03 | F-ViTA: Foundation Model Guided Visible to Thermal Translation | Jay N. Paranjape et.al. | 2504.02801 | link |
2025-04-03 | PicoPose: Progressive Pixel-to-Pixel Correspondence Learning for Novel Object Pose Estimation | Lihua Liu et.al. | 2504.02617 | link |
2025-04-03 | Bridging the Gap between Gaussian Diffusion Models and Universal Quantization for Image Compression | Lucas Relic et.al. | 2504.02579 | null |
2025-04-03 | L-LBVC: Long-Term Motion Estimation and Prediction for Learned Bi-Directional Video Compression | Yongqi Zhai et.al. | 2504.02560 | null |
2025-04-03 | Image Coding for Machines via Feature-Preserving Rate-Distortion Optimization | Samuel Fernández-Menduiña et.al. | 2504.02216 | null |
2025-04-02 | Representing Flow Fields with Divergence-Free Kernels for Reconstruction | Xingyu Ni et.al. | 2504.01913 | null |
2025-04-02 | SELIC: Semantic-Enhanced Learned Image Compression via High-Level Textual Guidance | Haisheng Fu et.al. | 2504.01279 | null |
2025-04-01 | Fundamentals of Caching Layered Data objects | Agrim Bari et.al. | 2504.01104 | null |
2025-04-01 | Green computing toward SKA era with RICK | Giovanni Lacopo et.al. | 2504.00959 | null |
2025-04-01 | Learned Image Compression with Dictionary-based Entropy Model | Jingbo Lu et.al. | 2504.00496 | link |
2025-04-01 | Learned Image Compression and Restoration for Digital Pathology | SeonYeong Lee et.al. | 2503.23862 | link |
2025-03-31 | An extrapolated and provably convergent algorithm for nonlinear matrix decomposition with the ReLU function | Nicolas Gillis et.al. | 2503.23832 | link |
2025-03-31 | TransVFC: A Transformable Video Feature Compression Framework for Machines | Yuxiao Sun et.al. | 2503.23772 | link |
2025-03-30 | Enhancing 3D Gaussian Splatting Compression via Spatial Condition-based Prediction | Jingui Ma et.al. | 2503.23337 | null |
2025-03-29 | Improved Motion Plane Adaptive 360-Degree Video Compression Using Affine Motion Models | Marina Ritthaler et.al. | 2503.23151 | null |
2025-03-29 | ShiftLIC: Lightweight Learned Image Compression with Spatial-Channel Shift Operations | Youneng Bao et.al. | 2503.23052 | link |
2025-03-27 | DeCompress: Denoising via Neural Compression | Ali Zafari et.al. | 2503.22015 | null |
2025-03-27 | Nonlinear Multiple Response Regression and Learning of Latent Spaces | Ye Tian et.al. | 2503.21608 | null |
2025-03-27 | F-INR: Functional Tensor Decomposition for Implicit Neural Representations | Sai Karthikeya Vemuri et.al. | 2503.21507 | null |
2025-03-27 | Embedding Compression Distortion in Video Coding for Machines | Yuxiao Sun et.al. | 2503.21469 | link |
2025-03-28 | Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression | Hanyue Tu et.al. | 2503.21284 | link |
2025-03-27 | WVSC: Wireless Video Semantic Communication with Multi-frame Compensation | Bingyan Xie et.al. | 2503.21197 | null |
2025-03-26 | Bandwidth Allocation for Cloud-Augmented Autonomous Driving | Peter Schafhalter et.al. | 2503.20127 | null |
2025-03-25 | Bitstream Collisions in Neural Image Compression via Adversarial Perturbations | Jordan Madden et.al. | 2503.19817 | link |
2025-03-25 | GIViC: Generative Implicit Video Compression | Ge Gao et.al. | 2503.19604 | null |
2025-03-25 | TFIC: End-to-End Text-Focused Image Compression for Coding for Machines | Stefano Della Fiore et.al. | 2503.19495 | null |
2025-03-25 | Multiscale Feature Importance-based Bit Allocation for End-to-End Feature Coding for Machines | Junle Liu et.al. | 2503.19278 | null |
2025-03-24 | Rank-Based Modeling for Universal Packets Compression in Multi-Modal Communications | Xuanhao Luo et.al. | 2503.19097 | link |
2025-03-24 | Merge Mode for Template-based Intra Mode Derivation (TIMD) in ECM | Mohsen Abdoli et.al. | 2503.18679 | null |
2025-03-24 | GranQ: Granular Zero-Shot Quantization with Unified Layer-Channel Awareness | Inpyo Hong et.al. | 2503.18339 | null |
2025-03-23 | Compression benchmarking of holotomography data using the OME-Zarr storage format | Dohyeon Lee et.al. | 2503.18037 | null |
2025-03-23 | Guided Diffusion for the Extension of Machine Vision to Human Visual Perception | Takahiro Shindo et.al. | 2503.17907 | null |
2025-03-21 | Optimal Neural Compressors for the Rate-Distortion-Perception Tradeoff | Eric Lei et.al. | 2503.17558 | null |
2025-03-20 | Samplets: Wavelet concepts for scattered data | Helmut Harbrecht et.al. | 2503.17487 | null |
2025-03-20 | Overview of Variable Rate Coding in JPEG AI | Panqi Jia et.al. | 2503.16288 | null |
2025-03-20 | PromptMobile: Efficient Promptus for Low Bandwidth Mobile Video Streaming | Liming Liu et.al. | 2503.16112 | null |
2025-03-19 | Fast Two-photon Microscopy by Neuroimaging with Oblong Random Acquisition (NORA) | Esther Whang et.al. | 2503.15487 | null |
2025-03-17 | Highly Efficient Direct Analytics on Semantic-aware Time Series Data Compression | Guoyou Sun et.al. | 2503.13246 | null |
2025-03-17 | OLÉ -- Online Learning Emulation in Cosmology | Sven Günther et.al. | 2503.13183 | link |
2025-03-17 | OSLO-IC: On-the-Sphere Learned Omnidirectional Image Compression with Attention Modules and Spatial Context | Paul Wawerek-López et.al. | 2503.13119 | null |
2025-03-17 | Knowledge Distillation: Enhancing Neural Network Compression with Integrated Gradients | David E. Hernandez et.al. | 2503.13008 | null |
2025-03-17 | Task-Oriented Feature Compression for Multimodal Understanding via Device-Edge Co-Inference | Cheng Yuan et.al. | 2503.12926 | null |
2025-03-20 | MambaIC: State Space Models for High-Performance Learned Image Compression | Fanhu Zeng et.al. | 2503.12461 | link |
2025-03-16 | A Parametric Family of Polynomial Wavelets for Signal and Image Processing | Mariantonia Cotronei et.al. | 2503.12403 | null |
2025-03-16 | RENO: Real-Time Neural Compression for 3D LiDAR Point Clouds | Kang You et.al. | 2503.12382 | link |
2025-03-14 | Gain-MLP: Improving HDR Gain Map Encoding via a Lightweight MLP | Trevor D. Canham et.al. | 2503.11883 | null |
2025-03-14 | Pathology Image Compression with Pre-trained Autoencoders | Srikar Yellapragada et.al. | 2503.11591 | null |
2025-03-14 | Enhanced Diagnostic Fidelity in Pathology Whole Slide Image Compression via Deep Learning | Maximilian Fischer et.al. | 2503.11350 | null |
2025-03-14 | FG-DFPN: Flow Guided Deformable Frame Prediction Network | M. Akın Yılmaz et.al. | 2503.11343 | link |
2025-03-14 | Leveraging Diffusion Knowledge for Generative Image Compression with Fractal Frequency-Aware Band Learning | Lingyu Zhu et.al. | 2503.11321 | null |
2025-03-14 | Deep Lossless Image Compression via Masked Sampling and Coarse-to-Fine Auto-Regression | Tiantian Li et.al. | 2503.11231 | null |
2025-03-14 | Stabilizing Quantization-Aware Training by Implicit-Regularization on Hessian Matrix | Junbiao Pang et.al. | 2503.11159 | null |
2025-03-14 | Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization | Kyle Sargent et.al. | 2503.11056 | null |
2025-03-13 | JPEG Compliant Compression for Both Human and Machine, A Report | Linfeng Ye et.al. | 2503.10912 | null |
2025-03-13 | Edge-Fog Computing-Enabled EEG Data Compression via Asymmetrical Variational Discrete Cosine Transform Network | Xin Zhu et.al. | 2503.09961 | null |
2025-03-12 | Bidirectional Learned Facial Animation Codec for Low Bitrate Talking Head Videos | Riku Takahashi et.al. | 2503.09787 | null |
2025-03-12 | PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling | Nikolai Körber et.al. | 2503.09368 | link |
2025-03-11 | Residual Learning and Filtering Networks for End-to-End Lossless Video Compression | Md Baharul Islam et.al. | 2503.08819 | null |
2025-03-11 | Using Powerful Prior Knowledge of Diffusion Model in Deep Unfolding Networks for Image Compressive Sensing | Chen Liao et.al. | 2503.08429 | link |
2025-03-11 | Explainable Autoencoder Design for RSSI-Based Multi-User Beam Probing and Hybrid Precoding | Asmaa Abdallah et.al. | 2503.08267 | null |
2025-03-10 | Inverting Parameterized Burrows-Wheeler Transform | Shogen Kawanami et.al. | 2503.06970 | null |
2025-03-10 | Rate distortion dimension and ergodic decomposition for | Masaki Tsukamoto et.al. | 2503.06851 | null |
2025-03-09 | Seeing Delta Parameters as JPEG Images: Data-Free Delta Compression with Discrete Cosine Transform | Chenyu Huang et.al. | 2503.06676 | null |
2025-03-12 | FEDS: Feature and Entropy-Based Distillation Strategy for Efficient Learned Image Compression | Haisheng Fu et.al. | 2503.06399 | null |
2025-03-07 | Cross-Layer-Optimized Link Selection for Hologram Video Streaming over Millimeter Wave Networks | Yiming Jiang et.al. | 2503.05195 | null |
2025-03-06 | Adapt3R: Adaptive 3D Scene Representation for Domain Transfer in Imitation Learning | Albert Wilcox et.al. | 2503.04877 | null |
2025-03-13 | Security and Real-time FPGA integration for Learned Image Compression | Alaa Mazouz et.al. | 2503.04867 | null |
2025-03-13 | Lightweight Embedded FPGA Deployment of Learned Image Compression with Knowledge Distillation and Hybrid Quantization | Alaa Mazouz et.al. | 2503.04832 | null |
2025-03-06 | PokéChamp: an Expert-level Minimax Language Agent | Seth Karten et.al. | 2503.04094 | null |
2025-03-06 | Significant challenges for astrophysical inference with next-generation gravitational-wave observatories | A. Makai Baker et.al. | 2503.04073 | null |
2025-03-11 | OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction | Huang Huang et.al. | 2503.03734 | null |
2025-03-08 | Rethinking Video Tokenization: A Conditioned Diffusion-based Approach | Nianzu Yang et.al. | 2503.03708 | link |
2025-03-04 | UAR-NVC: A Unified AutoRegressive Framework for Memory-Efficient Neural Video Compression | Jia Wang et.al. | 2503.02733 | null |
2025-03-04 | On the sensitivity of CDAWG-grammars | Hiroto Fujimaru et.al. | 2503.02415 | null |
2025-03-03 | How Low Can You Go? Searching for the Intrinsic Dimensionality of Complex Networks using Metric Node Embeddings | Nikolaos Nakis et.al. | 2503.01723 | link |
2025-03-03 | Lossy Neural Compression for Geospatial Analytics: A Review | Carlos Gomes et.al. | 2503.01505 | null |
2025-03-07 | DLF: Extreme Image Compression with Dual-generative Latent Fusion | Naifu Xue et.al. | 2503.01428 | null |
2025-03-03 | Improving the Efficiency of VVC using Partitioning of Reference Frames | Kamran Qureshi et.al. | 2503.01415 | null |
2025-03-03 | Multi-resolution Encoding for HTTP Adaptive Streaming using VVenC | Kamran Qureshi et.al. | 2503.01404 | null |
2025-03-01 | High Dynamic Range Video Compression: A Large-Scale Benchmark Dataset and A Learned Bit-depth Scalable Compression Algorithm | Zhaoyi Tian et.al. | 2503.00410 | link |
2025-03-01 | Taming Large Multimodal Agents for Ultra-low Bitrate Semantically Disentangled Image Compression | Juan Song et.al. | 2503.00399 | null |
2025-03-07 | CAT-3DGS: A Context-Adaptive Triplane Approach to Rate-Distortion-Optimized 3DGS Compression | Yu-Ting Zhan et.al. | 2503.00357 | null |
2025-02-28 | Towards Lossless Implicit Neural Representation via Bit Plane Decomposition | Woo Kyoung Han et.al. | 2502.21001 | link |
2025-02-28 | LesionLocator: Zero-Shot Universal Tumor Segmentation and Tracking in 3D Whole-Body Imaging | Maximilian Rokuss et.al. | 2502.20985 | null |
2025-02-28 | Towards Practical Real-Time Neural Video Compression | Zhaoyang Jia et.al. | 2502.20762 | link |
2025-02-27 | Balanced Rate-Distortion Optimization in Learned Image Compression | Yichi Zhang et.al. | 2502.20161 | link |
2025-02-27 | Transformer-Based Nonlinear Transform Coding for Multi-Rate CSI Compression in MIMO-OFDM Systems | Bumsu Park et.al. | 2502.19847 | null |
2025-02-26 | Zipping many-body quantum states: a scalable approach to diagonal entropy | Yu-Hsueh Chen et.al. | 2502.18898 | null |
2025-02-25 | Novel quantum circuit for image compression utilizing modified Toffoli gate and quantized transformed coefficient alongside a novel reset gate | Ershadul Haque et.al. | 2502.17815 | null |
2025-02-25 | Quantum neural compressive sensing for ghost imaging | Xinliang Zhai et.al. | 2502.17790 | null |
2025-02-24 | Optimized Memory System Architecture for VESA VDC-M Decoder with Multi-Slice Support | Hannah Yang et.al. | 2502.17729 | null |
2025-02-24 | Pleno-Generation: A Scalable Generative Face Video Compression Framework with Bandwidth Intelligence | Bolin Chen et.al. | 2502.17085 | null |
2025-02-24 | Hierarchical Semantic Compression for Consistent Image Semantic Restoration | Shengxi Li et.al. | 2502.16799 | null |
2025-02-24 | Continuous Patch Stitching for Block-wise Image Compression | Zifu Zhang et.al. | 2502.16795 | null |
2025-02-27 | Orchestrating Joint Offloading and Scheduling for Low-Latency Edge SLAM | Yao Zhang et.al. | 2502.16495 | null |
2025-02-22 | Large Language Model for Lossless Image Compression with Visual Prompts | Junhao Du et.al. | 2502.16163 | null |
2025-02-21 | Quantum autoencoders for image classification | Hinako Asaoka et.al. | 2502.15254 | null |
2025-02-21 | Interleaved Block-based Learned Image Compression with Feature Enhancement and Quantization Error Compensation | Shiqi Jiang et.al. | 2502.15188 | null |
2025-02-21 | FD-LSCIC: Frequency Decomposition-based Learned Screen Content Image Compression | Shiqi Jiang et.al. | 2502.15174 | null |
2025-02-20 | Compact Latent Representation for Image Compression (CLRIC) | Ayman A. Ameen et.al. | 2502.14937 | null |
2025-02-20 | Stereo Image Coding for Machines with Joint Visual Feature Compression | Dengchao Jin et.al. | 2502.14190 | null |
2025-02-19 | A General Framework for Augmenting Lossy Compressors with Topological Guarantees | Nathaniel Gorski et.al. | 2502.14022 | null |
2025-02-19 | A Lightweight Model for Perceptual Image Compression via Implicit Priors | Hao Wei et.al. | 2502.13988 | null |
2025-02-19 | Improving the Sparse Structure Learning of Spiking Neural Networks from the View of Compression Efficiency | Jiangrong Shen et.al. | 2502.13572 | null |
2025-02-18 | Guaranteed Conditional Diffusion: 3D Block-based Models for Scientific Data Compression | Jaemoon Lee et.al. | 2502.12951 | null |
2025-02-17 | Fully Dynamic LZ77 in Sublinear Time | Itai Boneh et.al. | 2502.12000 | null |
2025-02-17 | On Quantizing Neural Representation for Variable-Rate Video Coding | Junqi Shi et.al. | 2502.11729 | link |
2025-02-15 | AquaScope: Reliable Underwater Image Transmission on Mobile Devices | Beitong Tian et.al. | 2502.10891 | null |
2025-02-15 | ResiComp: Loss-Resilient Image Compression via Dual-Functional Masked Visual Token Modeling | Sixian Wang et.al. | 2502.10812 | null |
2025-02-15 | A Fast Quantum Image Compression Algorithm based on Taylor Expansion | Vu Tuan Hai et.al. | 2502.10684 | link |
2025-02-15 | Optimizing CNN Architectures for Advanced Thoracic Disease Classification | Tejas Mirthipati et.al. | 2502.10614 | null |
2025-02-14 | Conditional Latent Coding with Learnable Synthesized Reference for Deep Image Compression | Siqi Wu et.al. | 2502.09971 | null |
2025-02-13 | Differentially Private Compression and the Sensitivity of LZ77 | Jeremiah Blocki et.al. | 2502.09584 | null |
2025-02-13 | SQ-GAN: Semantic Image Communications Using Masked Vector Quantization | Francesco Pezone et.al. | 2502.09520 | link |
2025-02-13 | Large Images are Gaussians: High-Quality Large Image Representation with Levels of 2D Gaussian Splatting | Lingting Zhu et.al. | 2502.09039 | link |
2025-02-12 | Compression of Site-Specific Deep Neural Networks for Massive MIMO Precoding | Ghazal Kasalaee et.al. | 2502.08758 | null |
2025-02-11 | To clean or not to clean? Influence of pixel removal on event reconstruction using deep learning in CTAO | Tom François et.al. | 2502.07643 | null |
2025-02-19 | HDCompression: Hybrid-Diffusion Image Compression for Ultra-Low Bitrates | Lei Lu et.al. | 2502.07160 | null |
2025-02-12 | Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT | Dongyang Liu et.al. | 2502.06782 | null |
2025-02-10 | Solving Optimal Power Flow on a Data-Budget: Feature Selection on Smart Meter Data | Vassilis Kekatos et.al. | 2502.06683 | null |
2025-02-13 | CANeRV: Content Adaptive Neural Representation for Video Compression | Lv Tang et.al. | 2502.06181 | null |
2025-02-09 | Online Reward-Weighted Fine-Tuning of Flow Matching with Wasserstein Regularization | Jiajun Fan et.al. | 2502.06061 | null |
2025-02-09 | Constant sensitivity on the CDAWGs | Rikuya Hamai et.al. | 2502.05915 | null |
2025-02-09 | Linear Attention Modeling for Learned Image Compression | Donghui Feng et.al. | 2502.05741 | null |
2025-02-08 | Convolutional Deep Colorization for Image Compression: A Color Grid Based Approach | Ian Tassin et.al. | 2502.05402 | null |
2025-02-07 | CMamba: Learned Image Compression with State Space Models | Zhuojie Wu et.al. | 2502.04988 | null |
2025-02-06 | Semantic Feature Division Multiple Access for Digital Semantic Broadcast Channels | Shuai Ma et.al. | 2502.03949 | null |
2025-02-06 | Enhancing Online Learning Efficiency Through Heterogeneous Resource Integration with a Multi-Agent RAG System | Devansh Srivastav et.al. | 2502.03948 | null |
2025-02-05 | All-in-One Image Compression and Restoration | Huimin Zeng et.al. | 2502.03649 | link |
2025-02-05 | Towards characterizing dark matter subhalo perturbations in stellar streams with graph neural networks | Peter Xiangyuan Ma et.al. | 2502.03522 | null |
2025-02-05 | LED there be DoS: Exploiting variable bitrate IP cameras for network DoS | Emmanuel Goldberg et.al. | 2502.03177 | null |
2025-02-04 | On likelihood-based analysis of the gravitationally (de)lensed CMB | Julien Carron et.al. | 2502.02399 | null |
2025-02-04 | PALQA: A Novel Parameterized Position-Aware Lossy Quantum Autoencoder using LSB Control Qubit for Efficient Image Compression | Ershadul Haque et.al. | 2502.02188 | null |
2025-02-01 | Semantic Communication based on Generative AI: A New Approach to Image Compression and Edge Optimization | Francesco Pezone et.al. | 2502.01675 | null |
2025-02-10 | Compressed Image Generation with Denoising Diffusion Codebook Models | Guy Ohayon et.al. | 2502.01189 | null |
2025-02-02 | S2CFormer: Reorienting Learned Image Compression from Spatial Interaction to Channel Aggregation | Yunuo Chen et.al. | 2502.00700 | null |
2025-01-28 | Rate-Distortion under Neural Tracking of Speech: A Directed Redundancy Approach | Jan Østergaard et.al. | 2501.16762 | null |
2025-02-05 | Hybrid Quantum Neural Networks with Amplitude Encoding: Advancing Recovery Rate Predictions | Ying Chen et.al. | 2501.15828 | null |
2025-01-23 | The Redundancy of Non-Singular Channel Simulation | Gergely Flamich et.al. | 2501.14053 | null |
2025-02-01 | On Disentangled Training for Nonlinear Transform in Learned Image Compression | Han Li et.al. | 2501.13751 | link |
2025-01-23 | Diffusion-based Perceptual Neural Video Compression with Temporal Diffusion Information Reuse | Wenzhuo Ma et.al. | 2501.13528 | null |
2025-01-22 | Using simulation based inference on tidally perturbed dwarf galaxies: the dynamics of NGC205 | Axel Widmark et.al. | 2501.13148 | null |
2025-01-22 | Nonlinear reduction strategies for data compression: a comprehensive comparison from diffusion to advection problems | Isabella Carla Gonnella et.al. | 2501.12816 | null |
2025-01-22 | Entropy Polarization-Based Data Compression Without Frozen Set Construction | Zichang Ren et.al. | 2501.12584 | null |
2025-01-21 | The Gap Between Principle and Practice of Lossy Image Coding | Haotian Zhang et.al. | 2501.12330 | null |
2025-01-21 | RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression | Uri Gadot et.al. | 2501.12216 | null |
2025-01-20 | Efficient Bearing Sensor Data Compression via an Asymmetrical Autoencoder with a Lifting Wavelet Transform Layer | Xin Zhu et.al. | 2501.11737 | null |
2025-01-20 | Towards Loss-Resilient Image Coding for Unstable Satellite Networks | Hongwei Sha et.al. | 2501.11263 | null |
2025-01-18 | Mathematical model of parameters relevance in adaptive level-crossing sampling for electrocardiogram signals | Silvio Zanoli et.al. | 2501.10829 | null |
2025-01-30 | Lossless data compression at pragmatic rates | Andreas Theocharous et.al. | 2501.10103 | null |
2025-01-17 | Multi-Modal Attention Networks for Enhanced Segmentation and Depth Estimation of Subsurface Defects in Pulse Thermography | Mohammed Salah et.al. | 2501.09994 | link |
2025-01-31 | A Simple Aerial Detection Baseline of Multimodal Language Models | Qingyun Li et.al. | 2501.09720 | link |
2025-01-16 | Split Fine-Tuning for Large Language Models in Wireless Networks | Songge Zhang et.al. | 2501.09237 | null |
2025-01-13 | Motion Tracks: A Unified Representation for Human-Robot Transfer in Few-Shot Imitation Learning | Juntao Ren et.al. | 2501.06994 | null |
2025-01-12 | A General Framework for Error-controlled Unstructured Scientific Data Compression | Qian Gong et.al. | 2501.06910 | null |
2025-01-10 | From My View to Yours: Ego-Augmented Learning in Large Vision Language Models for Understanding Exocentric Daily Living Activities | Dominick Reilly et.al. | 2501.05711 | link |
2025-01-09 | Neural Architecture Codesign for Fast Physics Applications | Jason Weitz et.al. | 2501.05515 | link |
2025-01-09 | Principles and Metrics of Extreme Learning Machines Using a Highly Nonlinear Fiber | Mathilde Hary et.al. | 2501.05233 | null |
2025-01-09 | Emergence of Painting Ability via Recognition-Driven Evolution | Yi Lin et.al. | 2501.04966 | null |
2025-01-08 | GaussianVideo: Efficient Video Representation via Hierarchical Gaussian Splatting | Andrew Bond et.al. | 2501.04782 | null |
2025-01-08 | Unified Coding for Both Human Perception and Generalized Machine Analytics with CLIP Supervision | Kangsheng Yin et.al. | 2501.04579 | link |
2025-01-08 | An Efficient Adaptive Compression Method for Human Perception and Machine Vision Tasks | Lei Liu et.al. | 2501.04329 | null |
2025-01-03 | Listening and Seeing Again: Generative Error Correction for Audio-Visual Speech Recognition | Rui Liu et.al. | 2501.04038 | link |
2024-12-24 | MERCURY: A fast and versatile multi-resolution based global emulator of compound climate hazards | Shruti Nath et.al. | 2501.04018 | link |
2025-01-06 | A Novel Structure-Agnostic Multi-Objective Approach for Weight-Sharing Compression in Deep Neural Networks | Rasa Khosrowshahli et.al. | 2501.03095 | null |
2025-01-06 | Region of Interest based Medical Image Compression | Utkarsh Prakash Srivastava et.al. | 2501.02895 | null |
2025-01-06 | Constructing 4D Radio Map in LEO Satellite Networks with Limited Samples | Haoxuan Yuan et.al. | 2501.02775 | null |
2025-01-06 | Artificial Intelligence in Creative Industries: Advances Prior to 2025 | Nantheera Anantrasirichai et.al. | 2501.02725 | null |
2025-01-05 | Remote Inference over Dynamic Links via Adaptive Rate Deep Task-Oriented Vector Quantization | Eyal Fishel et.al. | 2501.02521 | link |
2025-01-17 | MetaNeRV: Meta Neural Representations for Videos with Spatial-Temporal Guidance | Jialong Guo et.al. | 2501.02427 | null |
2025-01-03 | Compressed Domain Prior-Guided Video Super-Resolution for Cloud Gaming Content | Qizhe Wang et.al. | 2501.01773 | null |
2025-01-01 | CoordFlow: Coordinate Flow for Pixel-wise Neural Video Representation | Daniel Silver et.al. | 2501.00975 | null |
2025-01-01 | Gradient Compression and Correlation Driven Federated Learning for Wireless Traffic Prediction | Chuanting Zhang et.al. | 2501.00732 | link |
2025-01-07 | Rapid, High-resolution and Distortion-free | Xiaoqing Wang et.al. | 2501.00256 | link |
2024-12-29 | Distributed Hybrid Sketching for | Neophytos Charalambides et.al. | 2412.20301 | null |
2024-12-19 | Quantum Implicit Neural Compression | Takuya Fujihashi et.al. | 2412.19828 | null |
2024-12-25 | Adaptive Rate Control for Deep Video Compression with Rate-Distortion Prediction | Bowen Gu et.al. | 2412.18834 | null |
2024-12-24 | Ultra-Low Complexity On-Orbit Compression for Remote Sensing Imagery via Block Modulated Imaging | Zhibin Wang et.al. | 2412.18417 | link |
2024-12-24 | Semantics Disentanglement and Composition for Versatile Codec toward both Human-eye Perception and Machine Vision Task | Jinming Liu et.al. | 2412.18158 | null |
2024-12-23 | CALLIC: Content Adaptive Learning for Lossless Image Compression | Daxin Li et.al. | 2412.17464 | null |
2024-12-23 | AsymLLIC: Asymmetric Lightweight Learned Image Compression | Shen Wang et.al. | 2412.17270 | null |
2024-12-22 | Foundation Model for Lossy Compression of Spatiotemporal Scientific Data | Xiao Li et.al. | 2412.17184 | null |
2024-12-24 | L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression | Junxuan Zhang et.al. | 2412.16642 | link |
2024-12-20 | Schmidt quantum compressor | Israel F. Araujo et.al. | 2412.16337 | null |
2024-12-20 | Sparse Point Clouds Assisted Learned Image Compression | Yiheng Jiang et.al. | 2412.15752 | null |
2024-12-18 | Super-Resolution Generative Adversarial Network for Data Compression of Direct Numerical Simulations | Ludovico Nista et.al. | 2412.14150 | null |
2024-12-18 | Efficient high performance computing with the ALICE Event Processing Nodes GPU-based farm | Federico Ronchetti et.al. | 2412.13755 | null |
2024-12-18 | Robust UAV Jittering and Task Scheduling in Mobile Edge Computing with Data Compression | Bin Li et.al. | 2412.13676 | null |
2024-12-18 | DarkIR: Robust Low-Light Image Restoration | Daniel Feijoo et.al. | 2412.13443 | link |
2024-12-17 | Identifying Bias in Deep Neural Networks Using Image Transforms | Sai Teja Erukude et.al. | 2412.13079 | link |
2024-12-17 | Stable Diffusion is a Natural Cross-Modal Decoder for Layered AI-generated Image Compression | Ruijie Chen et.al. | 2412.12982 | null |
2024-12-17 | Invisible Watermarks: Attacks and Robustness | Dongjun Hwang et.al. | 2412.12511 | link |
2024-12-16 | Representation learning for fast radio burst dynamic spectra | Dirk Kuiper et.al. | 2412.12394 | link |
2024-12-16 | Point Cloud-Assisted Neural Image Compression | Ziqun Li et.al. | 2412.11771 | null |
2024-12-16 | Whisper-GPT: A Hybrid Representation Audio Large Language Model | Prateek Verma et.al. | 2412.11449 | null |
2024-12-16 | Controllable Distortion-Perception Tradeoff Through Latent Diffusion for Neural Image Compression | Chuqin Zhou et.al. | 2412.11379 | null |
2024-12-16 | VRVVC: Variable-Rate NeRF-Based Volumetric Video Compression | Qiang Hu et.al. | 2412.11362 | null |
2024-12-14 | Progressive Compression with Universally Quantized Diffusion Models | Yibo Yang et.al. | 2412.10935 | null |
2024-12-14 | Learned Data Compression: Challenges and Opportunities for the Future | Qiyu Liu et.al. | 2412.10770 | null |
2024-12-11 | Implicit Neural Compression of Point Clouds | Hongning Ruan et.al. | 2412.10433 | null |
2024-12-12 | Video Seal: Open and Efficient Video Watermarking | Pierre Fernandez et.al. | 2412.09492 | link |
2024-12-12 | Learned Compression for Compressed Learning | Dan Jacobellis et.al. | 2412.09405 | link |
2024-12-12 | Versatile Volumetric Medical Image Coding for Human-Machine Vision | Jietao Chen et.al. | 2412.09231 | null |
2024-12-11 | Unicorn: Unified Neural Image Compression with One Number Reconstruction | Qi Zheng et.al. | 2412.08210 | null |
2024-12-09 | Splatter-360: Generalizable 360 | Zheng Chen et.al. | 2412.06250 | link |
2024-12-08 | Vision Transformer-based Semantic Communications With Importance-Aware Quantization | Joohyuk Park et.al. | 2412.06038 | null |
2024-12-08 | Matrix Pre-orthogonal-Matching Pursuit as a Fundamental AI Algorithm | Wei Qu et.al. | 2412.05878 | null |
2024-12-09 | UniMIC: Towards Universal Multi-modality Perceptual Image Compression | Yixin Gao et.al. | 2412.04912 | null |
2024-12-05 | Solving High-dimensional Inverse Problems Using Amortized Likelihood-free Inference with Noisy and Incomplete Data | Jice Zeng et.al. | 2412.04565 | null |
2024-12-05 | Diagnosing Systematic Effects Using the Inferred Initial Power Spectrum | Tristan Hoellinger et.al. | 2412.04443 | null |
2024-12-05 | Multi-Scale Node Embeddings for Graph Modeling and Generation | Riccardo Milocco et.al. | 2412.04354 | null |
2024-12-05 | Feature Coding in the Era of Large Models: Dataset, Test Conditions, and Benchmark | Changsheng Gao et.al. | 2412.04307 | link |
2024-12-05 | LL-ICM: Image Compression for Low-level Machine Vision via Large Vision-Language Model | Yuan Xue et.al. | 2412.03841 | null |
2024-12-04 | Electrocardiogram-based diagnosis of liver diseases: an externally validated and explainable machine learning approach | Juan Miguel Lopez Alcaraz et.al. | 2412.03717 | link |
2024-12-04 | Is JPEG AI going to change image forensics? | Edoardo Daniele Cannas et.al. | 2412.03261 | link |
2024-12-03 | Efficient Algorithms for Low Tubal Rank Tensor Approximation with Applications to Image Compression, Super-Resolution and Deep Learning | Salman Ahmadi-Asl et.al. | 2412.02598 | null |
2024-12-03 | Randomized algorithms for Kroncecker tensor decomposition and applications | Salman Ahmadi-Asl et.al. | 2412.02597 | null |
2024-12-03 | Efficient Model Compression Techniques with FishLeg | Jamie McGowan et.al. | 2412.02328 | null |
2024-12-02 | Efficient Compression of Sparse Accelerator Data Using Implicit Neural Representations and Importance Sampling | Xihaier Luo et.al. | 2412.01754 | link |
2024-12-02 | Robust and Transferable Backdoor Attacks Against Deep Image Compression With Selective Frequency Prior | Yi Yu et.al. | 2412.01646 | null |
2024-12-01 | Construction of generalized samplets in Banach spaces | Peter Balazs et.al. | 2412.00954 | null |
2024-11-30 | Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion | Jona Ballé et.al. | 2412.00505 | null |
2024-11-30 | Hybrid Local-Global Context Learning for Neural Video Compression | Yongqi Zhai et.al. | 2412.00446 | null |
2024-11-30 | DeepFGS: Fine-Grained Scalable Coding for Learned Image Compression | Yongqi Zhai et.al. | 2412.00437 | null |
2024-11-29 | AIDetx: a compression-based method for identification of machine-learning generated text | Leonardo Almeida et.al. | 2411.19869 | link |
2024-11-29 | Memristive Nanowire Network for Energy Efficient Audio Classification: Pre-Processing-Free Reservoir Computing with Reduced Latency | Akshaya Rajesh et.al. | 2411.19611 | null |
2024-11-29 | MCUCoder: Adaptive Bitrate Learned Video Compression for IoT Devices | Ali Hojjat et.al. | 2411.19442 | link |
2024-11-28 | Generalized Gaussian Model for Learned Image Compression | Haotian Zhang et.al. | 2411.19320 | null |
2024-11-28 | Upsampling Improvement for Overfitted Neural Coding | Pierrick Philippe et.al. | 2411.19249 | null |
2024-11-27 | Learning Optimal Linear Block Transform by Rate Distortion Minimization | Alessandro Gnutti et.al. | 2411.18494 | null |
2024-11-27 | HEMGS: A Hybrid Entropy Model for 3D Gaussian Splatting Data Compression | Lei Liu et.al. | 2411.18473 | null |
2024-11-26 | Evaluating the Overhead of the Performance Profiler Cloudprofiler With MooBench | Shinhyung Yang et.al. | 2411.17413 | null |
2024-11-26 | Motion Free B-frame Coding for Neural Video Compression | Van Thang Nguyen et.al. | 2411.17160 | null |
2024-11-30 | An Information-Theoretic Regularizer for Lossy Neural Image Compression | Yingwen Zhang et.al. | 2411.16727 | null |
2024-11-25 | WTDUN: Wavelet Tree-Structured Sampling and Deep Unfolding Network for Image Compressed Sensing | Kai Han et.al. | 2411.16336 | null |
2024-11-25 | Learning Optimal Lattice Vector Quantizers for End-to-end Neural Image Compression | Xi Zhang et.al. | 2411.16119 | null |
2024-11-25 | TransCompressor: LLM-Powered Multimodal Data Compression for Smart Transportation | Huanqi Yang et.al. | 2411.16020 | null |
2024-11-24 | Variable-size Symmetry-based Graph Fourier Transforms for image compression | Alessandro Gnutti et.al. | 2411.15824 | null |
2024-11-24 | M3-CVC: Controllable Video Compression with Multimodal Generative Models | Rui Wan et.al. | 2411.15798 | null |
2024-11-24 | Advanced Learning-Based Inter Prediction for Future Video Coding | Yanchen Zhao et.al. | 2411.15759 | null |
2024-11-24 | PEnG: Pose-Enhanced Geo-Localisation | Tavis Shore et.al. | 2411.15742 | link |
2024-11-21 | U-Motion: Learned Point Cloud Video Compression with U-Structured Motion Estimation | Tingyu Fan et.al. | 2411.14501 | null |
2024-11-21 | Differentiable SVD based on Moore-Penrose Pseudoinverse for Inverse Imaging Problems | Yinghao Zhang et.al. | 2411.14141 | link |
2024-11-21 | Compact Visual Data Representation for Green Multimedia -- A Human Visual System Perspective | Peilin Chen et.al. | 2411.14135 | null |
2024-11-27 | Image Compression Using Novel View Synthesis Priors | Luyuan Peng et.al. | 2411.13862 | null |
2024-11-20 | Sparse Input View Synthesis: 3D Representations and Reliable Priors | Nagabhushan Somraj et.al. | 2411.13631 | null |
2024-11-20 | Benchmarking Quantum Convolutional Neural Networks for Classification and Data Compression Tasks | Jun Yong Khoo et.al. | 2411.13468 | null |
2024-11-20 | Practical Compact Deep Compressed Sensing | Bin Chen et.al. | 2411.13081 | link |
2024-11-20 | LMM-driven Semantic Image-Text Coding for Ultra Low-bitrate Learned Image Compression | Shimon Murai et.al. | 2411.13033 | link |
2024-11-22 | Large Language Models for Lossless Image Compression: Next-Pixel Prediction in Language Space is All You Need | Kecheng Chen et.al. | 2411.12448 | null |
2024-11-19 | Breathless: An 8-hour Performance Contrasting Human and Robot Expressiveness | Catie Cuan et.al. | 2411.12361 | null |
2024-11-18 | Variable Rate Neural Compression for Sparse Detector Data | Yi Huang et.al. | 2411.11942 | link |
2024-11-18 | Exploring adversarial robustness of JPEG AI: methodology, comparison and new methods | Egor Kovalev et.al. | 2411.11795 | null |
2024-11-18 | Additional Tests for TV 3.0 | Eduardo Peixoto et.al. | 2411.11755 | null |
2024-11-18 | Towards fast DBSCAN via Spectrum-Preserving Data Compression | Yongyu Wang et.al. | 2411.11421 | null |
2024-11-17 | BVI-CR: A Multi-View Human Dataset for Volumetric Video Compression | Ge Gao et.al. | 2411.11199 | link |
2024-11-16 | An End-to-End Real-World Camera Imaging Pipeline | Kepeng Xu et.al. | 2411.10773 | null |
2024-11-16 | Deep Learning-Based Image Compression for Wireless Communications: Impacts on Reliability, Throughput, and Latency | Mostafa Naseri et.al. | 2411.10650 | link |
2024-11-15 | Efficient Progressive Image Compression with Variance-aware Masking | Alberto Presta et.al. | 2411.10185 | link |
2024-11-15 | A Multi-Scale Spatial-Temporal Network for Wireless Video Transmission | Xinyi Zhou et.al. | 2411.09936 | null |
2024-11-14 | Application of signal separation to diffraction image compression and serial crystallography | Jérôme Kieffer et.al. | 2411.09515 | link |
2024-11-14 | DT-JRD: Deep Transformer based Just Recognizable Difference Prediction Model for Video Coding for Machines | Junqi Liu et.al. | 2411.09308 | null |
2024-11-14 | Towards efficient compression and communication for prototype-based decentralized learning | Pablo Fernández-Piñeiro et.al. | 2411.09267 | null |
2024-11-13 | Learning Optimal and Interpretable Summary Statistics of Galaxy Catalogs with SBI | Kai Lehman et.al. | 2411.08957 | null |
2024-11-13 | LSH-MoE: Communication-efficient MoE Training via Locality-Sensitive Hashing | Xiaonan Nie et.al. | 2411.08446 | null |
2024-11-18 | Rendering-Oriented 3D Point Cloud Attribute Compression using Sparse Tensor-based Transformer | Xiao Huo et.al. | 2411.07899 | null |
2024-11-11 | Accelerating radio astronomy imaging with RICK | Emanuele De Rubeis et.al. | 2411.07321 | link |
2024-11-11 | Low Complexity Learning-based Lossless Event-based Compression | Ahmadreza Sezavar et.al. | 2411.07155 | null |
2024-11-11 | JPEG AI Image Compression Visual Artifacts: Detection Methods and Dataset | Daria Tsereh et.al. | 2411.06810 | null |
2024-11-11 | Machine vision-aware quality metrics for compressed image and video assessment | Mikhail Dremin et.al. | 2411.06776 | null |
2024-11-11 | High-Frequency Enhanced Hybrid Neural Representation for Video Compression | Li Yu et.al. | 2411.06685 | null |
2024-11-09 | HiHa: Introducing Hierarchical Harmonic Decomposition to Implicit Neural Compression for Atmospheric Data | Zhewen Xu et.al. | 2411.06155 | null |
2024-11-08 | A method based on Generative Adversarial Networks for disentangling physical and chemical properties of stars in astronomical spectra | Raúl Santoveña et.al. | 2411.05960 | null |
2024-11-07 | Don't Look Twice: Faster Video Transformers with Run-Length Tokenization | Rohan Choudhury et.al. | 2411.05222 | null |
2024-11-05 | Tuning into spatial frequency space: Satellite and space debris detection in the ZTF alert stream | J. P. Carvajal et.al. | 2411.03258 | null |
2024-11-15 | ZipCache: A DRAM/SSD Cache with Built-in Transparent Compression | Rui Xie et.al. | 2411.03174 | null |
2024-11-05 | Learning-based Lossless Event Data Compression | Ahmadreza Sezavar et.al. | 2411.03010 | null |
2024-11-04 | Neural optical flow for planar and stereo PIV | Andrew I. Masker et.al. | 2411.02373 | null |
2024-11-04 | The evolution of volumetric video: A survey of smart transcoding and compression approaches | Preetish Kakkar et.al. | 2411.02095 | null |
2024-11-03 | Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision | Xiangzhong Luo et.al. | 2411.01431 | null |
2024-11-02 | Autoencoders for At-Source Data Reduction and Anomaly Detection in High Energy Particle Detectors | Alexander Yue et.al. | 2411.01118 | null |
2024-11-01 | SANN-PSZ: Spatially Adaptive Neural Network for Head-Tracked Personal Sound Zones | Yue Qiao et.al. | 2411.00772 | null |
2024-10-28 | MultiTok: Variable-Length Tokenization for Efficient LLMs Adapted from LZW Compression | Noel Elias et.al. | 2410.21548 | link |
2024-10-29 | Enhancing Learned Image Compression via Cross Window-based Attention | Priyanka Mudgal et.al. | 2410.21144 | link |
2024-10-26 | Cross-Platform Neural Video Coding: A Case Study | Ruhan Conceição et.al. | 2410.20145 | null |
2024-10-25 | Conditional Hallucinations for Image Compression | Till Aczel et.al. | 2410.19493 | null |
2024-10-29 | Integration of Communication and Computational Imaging | Zhenming Yu et.al. | 2410.19415 | null |
2024-10-24 | DMVC: Multi-Camera Video Compression Network aimed at Improving Deep Learning Accuracy | Huan Cui et.al. | 2410.18400 | null |
2024-10-23 | Predicting total time to compress a video corpus using online inference systems | Xin Shu et.al. | 2410.18260 | null |
2024-10-23 | FIPER: Generalizable Factorized Fields for Joint Image Compression and Super-Resolution | Yang-Che Sun et.al. | 2410.18083 | null |
2024-10-23 | Learning Lossless Compression for High Bit-Depth Volumetric Medical Image | Kai Wang et.al. | 2410.17814 | null |
2024-10-21 | Variable Rate Learned Wavelet Video Coding with Temporal Layer Adaptivity | Anna Meyer et.al. | 2410.15873 | link |
2024-10-20 | Extensions on low-complexity DCT approximations for larger blocklengths based on minimal angle similarity | A. P. Radünz et.al. | 2410.15244 | null |
2024-10-19 | Standardizing Generative Face Video Compression using Supplemental Enhancement Information | Bolin Chen et.al. | 2410.15105 | null |
2024-10-16 | MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection | Bokai Lin et.al. | 2410.14731 | null |
2024-10-18 | Design and Prototype of a Unified Framework for Error-robust Compression and Encryption in IoT | Gajraj Kuldeep et.al. | 2410.14396 | null |
2024-10-18 | Compression using Discrete Multi-Level Divisor Transform for Heterogeneous Sensor Data | Gajraj Kuldeep et.al. | 2410.14287 | null |
2024-10-17 | In-context learning and Occam's razor | Eric Elmoznino et.al. | 2410.14086 | link |
2024-10-17 | Co-Segmentation without any Pixel-level Supervision with Application to Large-Scale Sketch Classification | Nikolaos-Antonios Ypsilantis et.al. | 2410.13582 | null |
2024-10-16 | Test-time adaptation for image compression with distribution regularization | Kecheng Chen et.al. | 2410.12191 | null |
2024-10-16 | Joint Data Compression, Secure Multi-Part Collaborative Task Offloading and Resource Assignment in Ultra-Dense Networks | Tianqing Zhou et.al. | 2410.12186 | null |
2024-10-14 | Large Language Model Evaluation via Matrix Nuclear-Norm | Yahan Li et.al. | 2410.10672 | link |
2024-10-14 | QIANets: Quantum-Integrated Adaptive Networks for Reduced Latency and Improved Inference Times in CNN Models | Zhumazhan Balapanov et.al. | 2410.10318 | link |
2024-10-14 | Generative Human Video Compression with Multi-granularity Temporal Trajectory Factorization | Shanzhi Yin et.al. | 2410.10171 | null |
2024-10-13 | Towards Reproducible Learning-based Compression | Jiahao Pang et.al. | 2410.09872 | null |
2024-10-13 | Compressing Scene Dynamics: A Generative Approach | Shanzhi Yin et.al. | 2410.09768 | link |
2024-10-13 | ECVC: Exploiting Non-Local Correlations in Multiple Frames for Contextual Video Compression | Wei Jiang et.al. | 2410.09706 | link |
2024-10-12 | Fine-grained subjective visual quality assessment for high-fidelity compressed images | Michela Testolina et.al. | 2410.09501 | link |
2024-10-11 | Fast Data-independent KLT Approximations Based on Integer Functions | A. P. Radünz et.al. | 2410.09227 | null |
2024-10-10 | Compressing high-resolution data through latent representation encoding for downscaling large-scale AI weather forecast model | Qian Liu et.al. | 2410.09109 | null |
2024-10-11 | Data-Driven Neural Estimation of Indirect Rate-Distortion Function | Zichao Yu et.al. | 2410.09018 | null |
2024-10-11 | Compressing regularised dynamics improves link prediction in sparse networks | Maja Lindström et.al. | 2410.08777 | link |
2024-10-11 | Beyond GFVC: A Progressive Face Video Compression Framework with Adaptive Visual Tokens | Bolin Chen et.al. | 2410.08485 | link |
2024-10-10 | What is Left After Distillation? How Knowledge Transfer Impacts Fairness and Bias | Aida Mohammadshahi et.al. | 2410.08407 | null |
2024-10-16 | Delta-ICM: Entropy Modeling with Delta Function for Learned Image Compression | Takahiro Shindo et.al. | 2410.07669 | null |
2024-10-10 | MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion | Onkar Susladkar et.al. | 2410.07659 | link |
2024-10-10 | R-Adaptive Mesh Optimization to Enhance Finite Element Basis Compression | Graham Harper et.al. | 2410.07646 | null |
2024-10-09 | JPEG Inspired Deep Learning | Ahmed H. Salamah et.al. | 2410.07081 | link |
2024-10-09 | SHRINK: Data Compression by Semantic Extraction and Residuals Encoding | Guoyou Sun et.al. | 2410.06713 | null |
2024-10-09 | Convex Distillation: Efficient Compression of Deep Networks via Convex Optimization | Prateek Varshney et.al. | 2410.06567 | null |
2024-10-09 | Efficient and Robust Knowledge Distillation from A Stronger Teacher Based on Correlation Matching | Wenqi Niu et.al. | 2410.06561 | null |
2024-10-08 | Covering Numbers for Deep ReLU Networks with Applications to Function Approximation and Nonparametric Regression | Weigutian Ou et.al. | 2410.06378 | null |
2024-10-08 | Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach | Sha Guo et.al. | 2410.06149 | null |
2024-10-08 | Resolution limit of the eye: how many pixels can we see? | Maliha Ashraf et.al. | 2410.06068 | null |
2024-10-07 | Transformers learn variable-order Markov chains in-context | Ruida Zhou et.al. | 2410.05493 | null |
2024-10-07 | Salient Store: Enabling Smart Storage for Continuous Learning Edge Servers | Cyan Subhra Mishra et.al. | 2410.05435 | null |
2024-10-07 | Causal Context Adjustment Loss for Learned Image Compression | Minghao Han et.al. | 2410.04847 | link |
2024-10-06 | Channel-Aware Throughput Maximization for Cooperative Data Fusion in CAV | Haonan An et.al. | 2410.04320 | null |
2024-10-05 | Robust Task-Oriented Communication Framework for Real-Time Collaborative Vision Perception | Zhengru Fang et.al. | 2410.04168 | link |
2024-10-04 | On the Rate-Distortion-Complexity Trade-offs of Neural Video Coding | Yi-Hsin Chen et.al. | 2410.03898 | null |
2024-10-04 | A Framework for Automatic Validation and Application of Lossy Data Compression in Ensemble Data Assimilation | Kai Keller et.al. | 2410.03184 | null |
2024-10-03 | GABIC: Graph-based Attention Block for Image Compression | Gabriele Spadaro et.al. | 2410.02981 | link |
2024-10-03 | Diffusion-based Extreme Image Compression with Compressed Feature Initialization | Zhiyuan Li et.al. | 2410.02640 | link |
2024-10-03 | High-Efficiency Neural Video Compression via Hierarchical Predictive Learning | Ming Lu et.al. | 2410.02598 | link |
2024-10-02 | A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation | Liang Chen et.al. | 2410.01912 | link |
2024-10-02 | COSMIC: Compress Satellite Images Efficiently via Diffusion Compensation | Ziyuan Zhang et.al. | 2410.01698 | link |
2024-10-03 | Releasing the Parameter Latency of Neural Representation for High-Efficiency Video Compression | Gai Zhang et.al. | 2410.01654 | null |
2024-10-02 | Task-Oriented Edge-Assisted Cooperative Data Compression, Communications and Computing for UGV-Enhanced Warehouse Logistics | Jiaming Yang et.al. | 2410.01515 | null |
2024-10-01 | STanH: Parametric Quantization for Variable Rate Learned Image Compression | Alberto Presta et.al. | 2410.00557 | null |
2024-09-30 | LaMMA-P: Generalizable Multi-Agent Long-Horizon Task Allocation and Planning with LM-Driven PDDL Planner | Xiaopan Zhang et.al. | 2409.20560 | null |
2024-09-30 | PerCo (SD): Open Perceptual Compression | Nikolai Körber et.al. | 2409.20255 | link |
2024-09-29 | All-in-One Image Coding for Joint Human-Machine Vision with Multi-Path Aggregation | Xu Zhang et.al. | 2409.19660 | link |
2024-09-28 | Fast Encoding and Decoding for Implicit Video Representation | Hao Chen et.al. | 2409.19429 | null |
2024-09-27 | Learning-Based Image Compression for Machines | Kartik Gupta et.al. | 2409.19184 | link |
2024-09-27 | Effectiveness of learning-based image codecs on fingerprint storage | Daniele Mari et.al. | 2409.18730 | link |
2024-09-27 | Decoding Complexity-Rate-Quality Pareto-Front for Adaptive VVC Streaming | Angeliki Katsenou et.al. | 2409.18713 | null |
2024-09-27 | Neural Video Representation for Redundancy Reduction and Consistency Preservation | Taiga Hayami et.al. | 2409.18497 | null |
2024-09-20 | Blockchain-Enabled Variational Information Bottleneck for Data Extraction Based on Mutual Information in Internet of Vehicles | Cui Zhang et.al. | 2409.17287 | null |
2024-09-25 | Streaming Neural Images | Marcos V. Conde et.al. | 2409.17134 | null |
2024-09-25 | PhD Forum: Efficient Privacy-Preserving Processing via Memory-Centric Computing | Mpoki Mwaisela et.al. | 2409.16777 | null |
2024-09-25 | The Effect of Lossy Compression on 3D Medical Images Segmentation with Deep Learning | Anvar Kurmukov et.al. | 2409.16733 | null |
2024-09-24 | AIM 2024 Challenge on UHD Blind Photo Quality Assessment | Vlad Hosu et.al. | 2409.16271 | null |
2024-09-25 | COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models | Kehui Liu et.al. | 2409.15146 | link |
2024-09-23 | AlphaZip: Neural Network-Enhanced Lossless Text Compression | Swathi Shree Narashiman et.al. | 2409.15046 | link |
2024-09-23 | Anomaly Detection from a Tensor Train Perspective | Alejandro Mata Ali et.al. | 2409.15030 | null |
2024-09-23 | AIM 2024 Challenge on Video Saliency Prediction: Methods and Results | Andrey Moskalenko et.al. | 2409.14827 | link |
2024-09-21 | Window-based Channel Attention for Wavelet-enhanced Learned Image Compression | Heng Xu et.al. | 2409.14090 | null |
2024-09-20 | Reduced bit median quantization: A middle process for Efficient Image Compression | Fikresilase Wondmeneh Abebayew et.al. | 2409.13789 | null |
2024-09-20 | Data Compression using Rank-1 Lattices for Parameter Estimation in Machine Learning | Michael Gnewuch et.al. | 2409.13453 | null |
2024-09-19 | Breaking the Barriers of One-to-One Usage of Implicit Neural Representation in Image Compression: A Linear Combination Approach with Performance Guarantees | Sai Sanjeet et.al. | 2409.13117 | link |
2024-09-19 | Optimal Coding for Randomized Kolmogorov Complexity and Its Applications | Shuichi Hirahara et.al. | 2409.12744 | null |
2024-09-19 | Multi-Scale Feature Prediction with Auxiliary-Info for Neural Image Compression | Chajin Shin et.al. | 2409.12719 | null |
2024-09-18 | One Map to Find Them All: Real-time Open-Vocabulary Mapping for Zero-shot Multi-Object Navigation | Finn Lukas Busch et.al. | 2409.11764 | null |
2024-09-18 | LFIC-DRASC: Deep Light Field Image Compression Using Disentangled Representation and Asymmetrical Strip Convolution | Shiyu Feng et.al. | 2409.11711 | null |
2024-09-18 | k-mer-based approaches to bridging pangenomics and population genetics | Miles D. Roberts et.al. | 2409.11683 | null |
2024-09-17 | Few-Shot Domain Adaptation for Learned Image Compression | Tianyu Zhang et.al. | 2409.11111 | null |
2024-09-17 | Edge-based Denoising Image Compression | Ryugo Morita et.al. | 2409.10978 | null |
2024-09-16 | Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning | Amin Karimi Monsefi et.al. | 2409.10362 | link |
2024-09-14 | Lossy Image Compression with Stochastic Quantization | Anton Kozyriev et.al. | 2409.09488 | null |
2024-09-13 | Fast DCT+: A Family of Fast Transforms Based on Rank-One Updates of the Path Graph | Samuel Fernández-Menduiña et.al. | 2409.08970 | null |
2024-09-13 | On the Computation of BD-Rate over a Set of Videos for Fair Assessment of Performance of Learned Video Codecs | M. Akin Yilmaz et.al. | 2409.08772 | null |
2024-09-13 | USTC-TD: A Test Dataset and Benchmark for Image and Video Coding in 2020s | Zhuoyuan Li et.al. | 2409.08481 | null |
2024-09-12 | Learned Compression for Images and Point Clouds | Mateen Ulhaq et.al. | 2409.08376 | link |
2024-09-11 | NVRC: Neural Video Representation Compression | Ho Man Kwan et.al. | 2409.07414 | null |
2024-09-11 | Dynamic Error-Bounded Hierarchical Matrices in Neural Network Compression | John Mango et.al. | 2409.07028 | null |
2024-09-10 | Universal End-to-End Neural Network for Lossy Image Compression | Bouzid Arezki et.al. | 2409.06586 | null |
2024-09-10 | Rate-Constrained Quantization for Communication-Efficient Federated Learning | Shayan Mohajer Hamidi et.al. | 2409.06319 | null |
2024-09-09 | Design and Implementation of TAO DAQ System | Shuihan Zhang et.al. | 2409.05522 | null |
2024-09-09 | A Taxonomy of Miscompressions: Preparing Image Forensics for Neural Compression | Nora Hofer et.al. | 2409.05490 | null |
2024-09-09 | Attention Based Machine Learning Methods for Data Reduction with Guaranteed Error Bounds | Xiao Li et.al. | 2409.05357 | null |
2024-09-06 | Convolutional Transformer-Based Image Compression | Bouzid Arezki et.al. | 2409.04118 | null |
2024-09-06 | 3D-GP-LMVIC: Learning-based Multi-View Image Coding with 3D Gaussian Geometric Priors | Yujun Huang et.al. | 2409.04013 | link |
2024-09-05 | TropNNC: Structured Neural Network Compression Using Tropical Geometry | Konstantinos Fotopoulos et.al. | 2409.03945 | null |
2024-09-05 | Unified Framework for Neural Network Compression via Decomposition and Optimal Rank Selection | Ali Aghababaei-Harandi et.al. | 2409.03555 | null |
2024-09-05 | Efficient Image Compression Using Advanced State Space Models | Bouzid Arezki et.al. | 2409.02743 | null |
2024-09-10 | FrameCorr: Adaptive, Autoencoder-based Neural Compression for Video Reconstruction in Resource and Timing Constrained Network Settings | John Li et.al. | 2409.02453 | null |
2024-09-03 | Compressed learning based onboard semantic compression for remote sensing platforms | Protim Bhattacharjee et.al. | 2409.01988 | link |
2024-09-03 | Map-Assisted Remote-Sensing Image Compression at Extremely Low Bitrates | Yixuan Ye et.al. | 2409.01935 | link |
2024-09-03 | Privacy-Preserving Multimedia Mobile Cloud Computing Using Protective Perturbation | Zhongze Tang et.al. | 2409.01710 | null |
2024-09-02 | Multi-Reference Generative Face Video Compression with Contrastive Learning | Goluck Konuko et.al. | 2409.01029 | link |
2024-09-02 | Accelerating block-level rate control for learned image compression | Muchen Dong et.al. | 2409.01009 | null |
2024-09-02 | PNVC: Towards Practical INR-based Video Compression | Ge Gao et.al. | 2409.00953 | null |
2024-09-01 | BWT construction and search at the terabase scale | Heng Li et.al. | 2409.00613 | link |
2024-08-30 | Prioritized Information Bottleneck Theoretic Framework with Distributed Online Learning for Edge Video Analytics | Zhengru Fang et.al. | 2409.00146 | link |
2024-08-28 | Quantum Kernel Principal Components Analysis for Compact Readout of Chemiresistive Sensor Arrays | Zeheng Wang et.al. | 2409.00115 | null |
2024-08-30 | NDP: Next Distribution Prediction as a More Broad Target | Junhao Ruan et.al. | 2408.17377 | null |
2024-08-30 | Approximately Invertible Neural Network for Learned Image Compression | Yanbo Gao et.al. | 2408.17073 | null |
2024-08-29 | UAV-Based Human Body Detector Selection and Fusion for Geolocated Saliency Map Generation | Piotr Rudol et.al. | 2408.16501 | null |
2024-08-29 | Convolutional Neural Network Compression Based on Low-Rank Decomposition | Yaping He et.al. | 2408.16289 | null |
2024-08-27 | Bandwidth-Aware and Overlap-Weighted Compression for Communication-Efficient Federated Learning | Zichen Tang et.al. | 2408.14736 | null |
2024-08-25 | Condensed Sample-Guided Model Inversion for Knowledge Distillation | Kuluhan Binici et.al. | 2408.13850 | null |
2024-08-12 | Semantic Variational Bayes Based on a Semantic Information Theory for Solving Latent Variables | Chenguang Lu et.al. | 2408.13122 | null |
2024-08-22 | Quantization-free Lossy Image Compression Using Integer Matrix Factorization | Pooya Ashtari et.al. | 2408.12691 | link |
2024-08-22 | DeepHQ: Learned Hierarchical Quantizer for Progressive Deep Image Coding | Jooyoung Lee et.al. | 2408.12150 | null |
2024-08-28 | AIM 2024 Challenge on Compressed Video Quality Assessment: Methods and Results | Maksim Smirnov et.al. | 2408.11982 | link |
2024-08-20 | Trustworthy Compression? Impact of AI-based Codecs on Biometrics for Law Enforcement | Sandra Bergmann et.al. | 2408.10823 | null |
2024-08-20 | Diff-PCC: Diffusion-based Neural Compression for 3D Point Clouds | Kai Liu et.al. | 2408.10543 | null |
2024-08-16 | LLM-PCGC: Large Language Model-based Point Cloud Geometry Compression | Yuqi Ye et.al. | 2408.08682 | null |
2024-08-16 | Bi-Directional Deep Contextual Video Compression | Xihua Sheng et.al. | 2408.08604 | null |
2024-08-16 | Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs | Jinming Liu et.al. | 2408.08575 | null |
2024-08-15 | Algebraic Vertex Ordering of a Sparse Graph for Adjacency Access Locality and Graph Compression | Dimitris Floros et.al. | 2408.08439 | null |
2024-08-15 | When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding | Pingping Zhang et.al. | 2408.08093 | null |
2024-08-15 | DM2RM: Dual-Mode Multimodal Ranking for Target Objects and Receptacles Based on Open-Vocabulary Instructions | Ryosuke Korekata et.al. | 2408.07910 | null |
2024-08-14 | Towards Real-time Video Compressive Sensing on Mobile Devices | Miao Cao et.al. | 2408.07530 | link |
2024-08-14 | Encoding and Decoding Algorithms of ANS Variants and Evaluation of Their Average Code Lengths | Hirosuke Yamamoto et.al. | 2408.07322 | null |
2024-08-13 | Subjective and Objective Quality Assessment of Rendered Human Avatar Videos in Virtual Reality | Yu-Chih Chen et.al. | 2408.07041 | null |
2024-08-13 | Feature-Preserving Rate-Distortion Optimization in Image Coding for Machines | Samuel Fernández Menduiña et.al. | 2408.07028 | null |
2024-08-19 | Joint Source-Channel Optimization for UAV Video Coding and Transmission | Kesong Wu et.al. | 2408.06667 | null |
2024-08-08 | Flow-Lenia.png: Evolving Multi-Scale Complexity by Means of Compression | Tadashi Adachi et.al. | 2408.06374 | null |
2024-08-09 | Benchmarking Conventional and Learned Video Codecs with a Low-Delay Configuration | Siyue Teng et.al. | 2408.05042 | null |
2024-08-08 | SG-JND: Semantic-Guided Just Noticeable Distortion Predictor For Image Compression | Linhan Cao et.al. | 2408.04273 | null |
2024-08-07 | Bi-Level Spatial and Channel-aware Transformer for Learned Image Compression | Hamidreza Soltani et.al. | 2408.03842 | null |
2024-08-07 | BVI-AOM: A New Training Dataset for Deep Video Compression Optimization | Jakub Nawała et.al. | 2408.03265 | link |
2024-08-06 | Enabling High-Throughput Parallel I/O in Particle-in-Cell Monte Carlo Simulations with openPMD and Darshan I/O Monitoring | Jeremy J. Williams et.al. | 2408.02869 | null |
2024-08-05 | Dimensionality Reduction and Nearest Neighbors for Improving Out-of-Distribution Detection in Medical Image Segmentation | McKell Woodland et.al. | 2408.02761 | link |
2024-08-04 | CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization | Xiang He et.al. | 2408.01952 | link |
2024-08-03 | Channel-Aware Distributed Transmission Control and Video Streaming in UAV Networks | Masoud Ghazikor et.al. | 2408.01885 | null |
2024-08-02 | An Adaptive Tensor-Train Decomposition Approach for Efficient Deep Neural Network Compression | Shiyi Luo et.al. | 2408.01534 | null |
2024-07-31 | Exploiting Change Blindness for Video Coding: Perspectives from a Less Promising User Study | Mitra Amiri et.al. | 2408.00052 | null |
2024-07-31 | Tora: Trajectory-oriented Diffusion Transformer for Video Generation | Zhenghao Zhang et.al. | 2407.21705 | link |
2024-07-30 | Edge Learning Based Collaborative Automatic Modulation Classification for Hierarchical Cognitive Radio Networks | Peihao Dong et.al. | 2407.20772 | link |
2024-07-30 | Understanding the Impact of Synchronous, Asynchronous, and Hybrid In-Situ Techniques in Computational Fluid Dynamics Applications | Yi Ju et.al. | 2407.20717 | null |
2024-07-29 | Homomorphic data compression for real time photon correlation analysis | Sebastian Strempfer et.al. | 2407.20356 | null |
2024-07-24 | Accelerating the Low-Rank Decomposed Models | Habib Hajimolahoseini et.al. | 2407.20266 | null |
2024-07-29 | ComNeck: Bridging Compressed Image Latents and Multimodal LLMs via Universal Transform-Neck | Chia-Hao Kao et.al. | 2407.19651 | null |
2024-07-28 | NVC-1B: A Large Neural Video Coding Model | Xihua Sheng et.al. | 2407.19402 | null |
2024-07-18 | Generative AI Augmented Induction-based Formal Verification | Aman Kumar et.al. | 2407.18965 | null |
2024-07-25 | The seismic purifier: An unsupervised approach to seismic signal detection via representation learning | Onur Efe et.al. | 2407.18402 | link |
2024-07-25 | Adaptable Deep Joint Source-and-Channel Coding for Small Satellite Applications | Olga Kondrateva et.al. | 2407.18146 | null |
2024-07-25 | Scaling Training Data with Lossy Image Compression | Katherine L. Mentzer et.al. | 2407.17954 | link |
2024-07-25 | Towards the Spectral bias Alleviation by Normalizations in Coordinate Networks | Zhicheng Cai et.al. | 2407.17834 | link |
2024-07-24 | Lossy Data Compression By Adaptive Mesh Coarsening | N. Böing et.al. | 2407.17316 | null |
2024-07-24 | High Efficiency Image Compression for Large Visual-Language Models | Binzhe Li et.al. | 2407.17060 | null |
2024-07-23 | Accelerating Learned Video Compression via Low-Resolution Representation Learning | Zidian Qiu et.al. | 2407.16418 | null |
2024-07-24 | FCNR: Fast Compressive Neural Representation of Visualization Images | Yunfei Lu et.al. | 2407.16369 | link |
2024-07-19 | Shapley Pruning for Neural Network Compression | Kamil Adamczewski et.al. | 2407.15875 | null |
2024-07-18 | CIC: Circular Image Compression | Honggui Li et.al. | 2407.15870 | null |
2024-07-22 | Online String Attractors | Philip Whittington et.al. | 2407.15599 | null |
2024-07-22 | Spectral properties of bright deposits in permanently shadowed craters on Ceres | Stefan Schröder et.al. | 2407.15327 | null |
2024-07-21 | Lessons Learned on the Path to Guaranteeing the Error Bound in Lossy Quantizers | Alex Fallin et.al. | 2407.15037 | null |
2024-07-19 | A Benchmark for Gaussian Splatting Compression and Quality Assessment Study | Qi Yang et.al. | 2407.14197 | link |
2024-07-18 | Training Foundation Models as Data Compression: On Information, Model Weights and Copyright Law | Giorgio Franceschelli et.al. | 2407.13493 | null |
2024-07-18 | Learned HDR Image Compression for Perceptually Optimal Storage and Display | Peibei Cao et.al. | 2407.13179 | null |
2024-07-17 | High Frequency Matters: Uncertainty Guided Image Compression with Wavelet Diffusion | Juan Song et.al. | 2407.12538 | link |
2024-07-17 | Enhancing Film Grain Coding in VVC: Improving Encoding Quality and Efficiency | Vignesh V Menon et.al. | 2407.12465 | null |
2024-07-17 | Reliability Function of Classical-Quantum Channels | Ke Li et.al. | 2407.12403 | null |
2024-07-17 | Exploiting Inter-Image Similarity Prior for Low-Bitrate Remote Sensing Image Compression | Junhui Li et.al. | 2407.12295 | null |
2024-07-16 | Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors | Matt Gorbett et.al. | 2407.12075 | null |
2024-07-17 | Rate-Distortion-Cognition Controllable Versatile Neural Image Compression | Jinming Liu et.al. | 2407.11700 | null |
2024-07-16 | MINI-LLM: Memory-Efficient Structured Pruning for Large Language Models | Hongrong Cheng et.al. | 2407.11681 | null |
2024-07-17 | Neural Compression of Atmospheric States | Piotr Mirowski et.al. | 2407.11666 | null |
2024-07-16 | Rethinking Learned Image Compression: Context is All You Need | Jixiang Luo et.al. | 2407.11590 | null |
2024-07-16 | The impact of lossy data compression on the power spectrum of the high redshift 21-cm signal with LOFAR | J. K. Chege et.al. | 2407.11557 | null |
2024-07-21 | Uniformly Accelerated Motion Model for Inter Prediction | Zhuoyuan Li et.al. | 2407.11541 | null |
2024-07-15 | M18K: A Comprehensive RGB-D Dataset and Benchmark for Mushroom Detection and Instance Segmentation | Abdollah Zakeri et.al. | 2407.11275 | link |
2024-07-15 | Enhancing Electrocardiogram Signal Analysis Using NLP-Inspired Techniques: A Novel Approach with Embedding and Self-Attention | Prapti Ganguly et.al. | 2407.11102 | null |
2024-07-15 | In-Loop Filtering via Trained Look-Up Tables | Zhuoyuan Li et.al. | 2407.10926 | null |
2024-07-15 | Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model | Zhening Liu et.al. | 2407.10632 | link |
2024-07-14 | UMI on Legs: Making Manipulation Policies Mobile with Manipulation-Centric Whole-body Controllers | Huy Ha et.al. | 2407.10353 | null |
2024-07-13 | WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model | Haisheng Fu et.al. | 2407.09983 | null |
2024-07-13 | Zero-Shot Image Compression with Diffusion-Based Posterior Sampling | Noam Elata et.al. | 2407.09896 | link |
2024-07-13 | Image Compression for Machine and Human Vision with Spatial-Frequency Adaptation | Han Li et.al. | 2407.09853 | link |
2024-07-13 | Infinite families of optimal and minimal codes over rings using simplicial complexes | Yanan Wu et.al. | 2407.09783 | null |
2024-07-12 | HPC: Hierarchical Progressive Coding Framework for Volumetric Video | Zihan Zheng et.al. | 2407.09026 | null |
2024-07-12 | Hybrid Temporal Computing for Lower Power Hardware Accelerators | Maliha Tasnim et.al. | 2407.08975 | null |
2024-07-11 | Manipulating a Tetris-Inspired 3D Video Representation | Mihir Godbole et.al. | 2407.08885 | null |
2024-07-11 | OMR-NET: a two-stage octave multi-scale residual network for screen content image compression | Shiqi Jiang et.al. | 2407.08545 | null |
2024-07-11 | CADC: Encoding User-Item Interactions for Compressing Recommendation Model Training Data | Hossein Entezari Zarch et.al. | 2407.08108 | null |
2024-07-10 | Using Low-Discrepancy Points for Data Compression in Machine Learning: An Experimental Comparison | Simone Göttlich et.al. | 2407.07450 | null |
2024-07-10 | Standard compliant video coding using low complexity, switchable neural wrappers | Yueyu Hu et.al. | 2407.07395 | null |
2024-07-10 | MNeRV: A Multilayer Neural Representation for Videos | Qingling Chang et.al. | 2407.07347 | link |
2024-07-11 | Entropy Law: The Story Behind Data Compression and LLM Performance | Mingjia Yin et.al. | 2407.06645 | link |
2024-07-08 | A Hybrid Algorithm for Computing a Partial Singular Value Decomposition Satisfying a Given Threshold | James Baglama et.al. | 2407.06306 | link |
2024-07-08 | TAPVid-3D: A Benchmark for Tracking Any Point in 3D | Skanda Koppula et.al. | 2407.05921 | link |
2024-07-05 | The Impact of Quantization and Pruning on Deep Reinforcement Learning Models | Heng Lu et.al. | 2407.04803 | null |
2024-07-05 | An autoencoder for compressing angle-resolved photoemission spectroscopy data | Steinn Ymir Agustsson et.al. | 2407.04631 | link |
2024-07-05 | Rethinking Image Compression on the Web with Generative AI | Shayan Ali Hassan et.al. | 2407.04542 | null |
2024-07-11 | A High-Quality Workflow for Multi-Resolution Scientific Data Reduction and Visualization | Daoce Wang et.al. | 2407.04267 | null |
2024-07-04 | Autoencoded Image Compression for Secure and Fast Transmission | Aryan Kashyap Naveen et.al. | 2407.03990 | link |
2024-07-03 | Value-Penalized Auxiliary Control from Examples for Learning without Rewards or Demonstrations | Trevor Ablett et.al. | 2407.03311 | link |
2024-07-03 | KeyVideoLLM: Towards Large-scale Video Keyframe Selection | Hao Liang et.al. | 2407.03104 | null |
2024-07-01 | Statistical Analysis of ZFP: Understanding Bias | Alyson Fox et.al. | 2407.01826 | null |
2024-07-01 | An AI-based, Error-bounded Compression Scheme for High-frequency Power Quality Disturbance Data | Markus Stroot et.al. | 2407.01112 | null |
2024-06-28 | Wavelets Are All You Need for Autoregressive Image Generation | Wael Mattar et.al. | 2406.19997 | null |
2024-06-28 | Optimal Video Compression using Pixel Shift Tracking | Hitesh Saai Mananchery Panneerselvam et.al. | 2406.19630 | link |
2024-06-27 | MCNC: Manifold Constrained Network Compression | Chayne Thrash et.al. | 2406.19301 | null |
2024-06-27 | Staggered Quantizers for Perfect Perceptual Quality: A Connection between Quantizers with Common Randomness and Without | Ruida Zhou et.al. | 2406.19248 | null |
2024-06-25 | Asymptotically Minimax Regret by Bayes Mixtures | Jun'ichi Takeuchi et.al. | 2406.17929 | null |
2024-06-24 | Hierarchical B-frame Video Coding for Long Group of Pictures | Ivan Kirillov et.al. | 2406.16544 | null |
2024-06-20 | Ranking LLMs by compression | Peijia Guo et.al. | 2406.14171 | null |
2024-06-21 | Measuring Sample Importance in Data Pruning for Training LLMs from a Data Compression Perspective | Minsang Kim et.al. | 2406.14124 | null |
2024-06-20 | Prediction and Reference Quality Adaptation for Learned Video Compression | Xihua Sheng et.al. | 2406.14118 | null |
2024-06-19 | Convex-hull Estimation using XPSNR for Versatile Video Coding | Vignesh V Menon et.al. | 2406.13712 | null |
2024-06-19 | A Study on the Effect of Color Spaces in Learned Image Compression | Srivatsa Prativadibhayankaram et.al. | 2406.13709 | null |
2024-06-19 | Stability and Generalizability in SDE Diffusion Models with Measure-Preserving Dynamics | Weitong Zhang et.al. | 2406.13652 | null |
2024-06-18 | Learned Image Compression for HE-stained Histopathological Images via Stain Deconvolution | Maximilian Fischer et.al. | 2406.12623 | null |
2024-06-18 | Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines | Honglei Zhang et.al. | 2406.12367 | null |
2024-06-15 | How Should We Extract Discrete Audio Tokens from Self-Supervised Models? | Pooneh Mousavi et.al. | 2406.10735 | null |
2024-06-15 | Object-Attribute-Relation Representation based Video Semantic Communication | Qiyuan Du et.al. | 2406.10469 | null |
2024-06-14 | On Efficient Neural Network Architectures for Image Compression | Yichi Zhang et.al. | 2406.10361 | link |
2024-06-14 | Information Compression in the AI Era: Recent Advances and Future Challenges | Jun Chen et.al. | 2406.10036 | null |
2024-06-13 | CMC-Bench: Towards a New Paradigm of Visual Signal Compression | Chunyi Li et.al. | 2406.09356 | link |
2024-06-13 | Neural NeRF Compression | Tuan Pham et.al. | 2406.08943 | null |
2024-06-14 | Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models | Yi-Fan Zhang et.al. | 2406.08487 | link |
2024-06-12 | On Annotation-free Optimization of Video Coding for Machines | Marc Windsheimer et.al. | 2406.07938 | null |
2024-06-11 | SSNVC: Single Stream Neural Video Compression with Implicit Temporal Information | Feng Wang et.al. | 2406.07645 | null |
2024-06-11 | Image and Video Tokenization with Binary Spherical Quantization | Yue Zhao et.al. | 2406.07548 | link |
2024-06-11 | Optimal Matrix-Mimetic Tensor Algebras via Variable Projection | Elizabeth Newman et.al. | 2406.06942 | link |
2024-06-10 | Deep Generative Modeling Reshapes Compression and Transmission: From Efficiency to Resiliency | Jincheng Dai et.al. | 2406.06446 | null |
2024-06-10 | Image Compression with Isotropic and Anisotropic Shepard Inpainting | Rahul Mohideen Kaja Mohideen et.al. | 2406.06247 | null |
2024-06-10 | Efficient Neural Compression with Inference-time Decoding | C. Metz et.al. | 2406.06237 | null |
2024-06-10 | Fiducial-Cosmology-dependent systematics for the DESI 2024 BAO Analysis | A. Pérez-Fernández et.al. | 2406.06085 | null |
2024-06-10 | Quantum Sparse Coding and Decoding Based on Quantum Network | Xun Ji et.al. | 2406.06012 | null |
2024-06-09 | Region of Interest Loss for Anonymizing Learned Image Compression | Christoph Liebender et.al. | 2406.05726 | link |
2024-06-08 | Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models | Minho Park et.al. | 2406.05432 | link |
2024-06-07 | PatchSVD: A Non-uniform SVD-based Image Compression Algorithm | Zahra Golpayegani et.al. | 2406.05129 | link |
2024-06-07 | SMC++: Masked Learning of Unsupervised Video Semantic Compression | Yuan Tian et.al. | 2406.04765 | link |
2024-06-06 | LDM-RSIC: Exploring Distortion Prior with Latent Diffusion Models for Remote Sensing Image Compression | Junhui Li et.al. | 2406.03961 | link |
2024-06-05 | Lossless Image Compression Using Multi-level Dictionaries: Binary Images | Samar Agnihotri et.al. | 2406.03087 | null |
2024-06-05 | On Jacob Ziv's Individual-Sequence Approach to Information Theory | Neri Merhav et.al. | 2406.02904 | null |
2024-06-04 | Towards AI-Assisted Sustainable Adaptive Video Streaming Systems: Tutorial and Survey | Reza Farahani et.al. | 2406.02302 | null |
2024-06-03 | Video Coding with Cross-Component Sample Offset | Han Gao et.al. | 2406.01795 | null |
2024-06-05 | Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption | Anqi Li et.al. | 2406.00758 | link |
2024-06-01 | Efficient Massive Black Hole Binary parameter estimation for LISA using Sequential Neural Likelihood | Iván Martín Vílchez et.al. | 2406.00565 | null |
2024-06-01 | A Review of Pulse-Coupled Neural Network Applications in Computer Vision and Image Processing | Nurul Rafi et.al. | 2406.00239 | null |
2024-05-31 | ContextGS: Compact 3D Gaussian Splatting with Anchor Level Context Model | Yufei Wang et.al. | 2405.20721 | link |
2024-05-30 | Quantum encoder for fixed Hamming-weight subspaces | Renato M. S. Farias et.al. | 2405.20408 | null |
2024-05-29 | Implicit Neural Image Field for Biological Microscopy Image Compression | Gaole Dai et.al. | 2405.19012 | link |
2024-05-28 | Deep Network Pruning: A Comparative Study on CNNs in Face Recognition | Fernando Alonso-Fernandez et.al. | 2405.18302 | null |
2024-05-28 | Channel Reciprocity Based Attack Detection for Securing UWB Ranging by Autoencoder | Wenlong Gou et.al. | 2405.18255 | null |
2024-05-27 | Evaluation of Resource-Efficient Crater Detectors on Embedded Systems | Simon Vellas et.al. | 2405.16953 | link |
2024-05-27 | UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation | Runzhao Yang et.al. | 2405.16850 | null |
2024-05-27 | Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model | Shoma Iwai et.al. | 2405.16817 | link |
2024-05-25 | N-BVH: Neural ray queries with bounding volume hierarchies | Philippe Weier et.al. | 2405.16237 | link |
2024-05-25 | A 7K Parameter Model for Underwater Image Enhancement based on Transmission Map Prior | Fuheng Zhou et.al. | 2405.16197 | link |
2024-05-24 | Analytical proxy to families of numerical solutions: the case study of spherical mini-boson stars | Jianzhi Yang et.al. | 2405.15651 | null |
2024-05-24 | SATSense: Multi-Satellite Collaborative Framework for Spectrum Sensing | Haoxuan Yuan et.al. | 2405.15542 | null |
2024-05-24 | Meta-meshing and triangulating lattice structures at a large scale | Qiang Zou et.al. | 2405.15197 | null |
2024-05-23 | NeCGS: Neural Compression for 3D Geometry Sets | Siyu Ren et.al. | 2405.15034 | link |
2024-05-23 | An augmented Lagrangian trust-region method with inexact gradient evaluations to accelerate constrained optimization problems using model hyperreduction | Tianshu Wen et.al. | 2405.14827 | null |
2024-05-23 | Motion-based video compression for resource-constrained camera traps | Malika Nisal Ratnayake et.al. | 2405.14419 | null |
2024-06-01 | I | Meiqin Liu et.al. | 2405.14336 | link |
2024-05-23 | Sparse | Matthias Chung et.al. | 2405.14270 | null |
2024-05-22 | "Turing Tests" For An AI Scientist | Xiaoxin Yin et.al. | 2405.13352 | null |
2024-05-21 | Efficient Learned Wavelet Image and Video Coding | Anna Meyer et.al. | 2405.12631 | null |
2024-05-24 | Accelerating Relative Entropy Coding with Space Partitioning | Jiajun He et.al. | 2405.12203 | null |
2024-05-20 | Refining Coded Image in Human Vision Layer Using CNN-Based Post-Processing | Takahiro Shindo et.al. | 2405.11894 | null |
2024-05-19 | Effective In-Context Example Selection through Data Compression | Zhongxiang Sun et.al. | 2405.11465 | null |
2024-05-18 | InfRS: Incremental Few-Shot Object Detection in Remote Sensing Images | Wuzhou Li et.al. | 2405.11293 | link |
2024-05-17 | Dark Energy Survey Year 3 results: simulation-based cosmological inference with wavelet harmonics, scattering transforms, and moments of weak lensing mass maps II. Cosmological results | M. Gatti et.al. | 2405.10881 | null |
2024-05-17 | Reduced storage direct tensor ring decomposition for convolutional neural networks compression | Mateusz Gabor et.al. | 2405.10802 | link |
2024-05-17 | Enhancing Perception Quality in Remote Sensing Image Compression via Invertible Neural Network | Junhui Li et.al. | 2405.10518 | null |
2024-05-15 | Properties that allow or prohibit transferability of adversarial attacks among quantized networks | Abhishek Shrestha et.al. | 2405.09598 | link |
2024-05-15 | Sensitivity Decouple Learning for Image Compression Artifacts Reduction | Li Ma et.al. | 2405.09291 | null |
2024-05-18 | Scalable Image Coding for Humans and Machines Using Feature Fusion Network | Takahiro Shindo et.al. | 2405.09152 | link |
2024-05-14 | Parameter-Efficient Instance-Adaptive Neural Video Compression | Hyunmo Yang et.al. | 2405.08530 | link |
2024-05-13 | Goal-oriented compression for | Yifei Sun et.al. | 2405.07808 | null |
2024-05-13 | Neural Network Compression for Reinforcement Learning Tasks | Dmitry A. Ivanov et.al. | 2405.07748 | null |
2024-05-13 | On the Adversarial Robustness of Learning-based Image Compression Against Rate-Distortion Attacks | Chenhao Wu et.al. | 2405.07717 | null |
2024-05-21 | An Efficient Compression Method for Sign Information of DCT Coefficients via Sign Retrieval | Chihiro Tsutake et.al. | 2405.07487 | link |
2024-05-10 | Time-of-arrival Estimation and Phase Unwrapping of Head-related Transfer Functions With Integer Linear Programming | Chin-Yun Yu et.al. | 2405.06804 | link |
2024-05-08 | Urban Boundary Delineation from Commuting Data with Bayesian Stochastic Blockmodeling: Scale, Contiguity, and Hierarchy | Sebastian Morel-Balbi et.al. | 2405.04911 | link |
2024-05-14 | Some Notes on the Sample Complexity of Approximate Channel Simulation | Gergely Flamich et.al. | 2405.04363 | null |
2024-05-07 | Group-aware Parameter-efficient Updating for Content-Adaptive Neural Video Compression | Zhenghao Chen et.al. | 2405.04274 | null |
2024-05-08 | Verified Neural Compressed Sensing | Rudy Bunel et.al. | 2405.04260 | null |
2024-05-15 | Lossy Compression with Data, Perception, and Classification Constraints | Yuhan Wang et.al. | 2405.04144 | link |
2024-05-07 | DMOFC: Discrimination Metric-Optimized Feature Compression | Changsheng Gao et.al. | 2405.04044 | null |
2024-05-06 | Computational ghost imaging with hybrid transforms by integrating Hadamard, discrete cosine, and Haar matrices | Yi-Ning Zhao et.al. | 2405.03729 | null |
2024-05-06 | A Rate-Distortion-Classification Approach for Lossy Image Compression | Yuefeng Zhang et.al. | 2405.03500 | null |
2024-05-06 | Structure-Preserving Network Compression Via Low-Rank Induced Training Through Linear Layers Composition | Xitong Zhang et.al. | 2405.03089 | link |
2024-05-04 | Deep Pulse-Signal Magnification for remote Heart Rate Estimation in Compressed Videos | Joaquim Comas et.al. | 2405.02652 | null |
2024-05-06 | Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design | Jian Meng et.al. | 2405.01775 | link |
2024-05-02 | PointCompress3D -- A Point Cloud Compression Framework for Roadside LiDARs in Intelligent Transportation Systems | Walter Zimmer et.al. | 2405.01750 | null |
2024-04-28 | Lightweight Conceptual Dictionary Learning for Text Classification Using Information Compression | Li Wan et.al. | 2405.01584 | null |
2024-05-02 | GroupedMixer: An Entropy Model with Group-wise Token-Mixers for Learned Image Compression | Daxin Li et.al. | 2405.01170 | null |
2024-04-30 | Analysis and Enhancement of Lossless Image Compression in JPEG-XL | Rustam Mamedov et.al. | 2404.19755 | null |
2024-04-30 | EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization | Jianzong Wang et.al. | 2404.19214 | null |
2024-04-29 | Towards Extreme Image Compression with Latent Feature Guidance and Diffusion Prior | Zhiyuan Li et.al. | 2404.18820 | link |
2024-04-28 | Joint Reference Frame Synthesis and Post Filter Enhancement for Versatile Video Coding | Weijie Bao et.al. | 2404.18058 | null |
2024-04-25 | Learning Visuotactile Skills with Two Multifingered Hands | Toru Lin et.al. | 2404.16823 | link |
2024-04-24 | Domain Adaptation for Learned Image Compression with Supervised Adapters | Alberto Presta et.al. | 2404.15591 | link |
2024-04-23 | One-Pass Randomized Algorithm with Practical Rangefinder for Low-Rank Approximation to Quaternion Matrices | Chao Chang et.al. | 2404.14783 | link |
2024-04-22 | Neural Compress-and-Forward for the Relay Channel | Ezgi Ozyilkan et.al. | 2404.14594 | null |
2024-04-22 | Taming Server Memory TCO with Multiple Software-Defined Compressed Tiers | Sandeep Kumar et.al. | 2404.13886 | null |
2024-04-20 | HybridFlow: Infusing Continuity into Masked Codebook for Extreme Low-Bitrate Image Compression | Lei Lu et.al. | 2404.13372 | null |
2024-04-18 | Image Compression and Reconstruction Based on Quantum Network | Xun Ji et.al. | 2404.11994 | null |
2024-04-17 | Spatio-Temporal Motion Retargeting for Quadruped Robots | Taerim Yoon et.al. | 2404.11557 | null |
2024-04-17 | Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems | Luca Bompani et.al. | 2404.11488 | link |
2024-04-17 | Image Generative Semantic Communication with Multi-Modal Similarity Estimation for Resource-Limited Networks | Eri Hosonuma et.al. | 2404.11280 | null |
2024-04-16 | Tripod: Three Complementary Inductive Biases for Disentangled Representation Learning | Kyle Hsu et.al. | 2404.10282 | link |
2024-04-16 | Compressible and Searchable: AI-native Multi-Modal Retrieval System with Learned Image Compression | Jixiang Luo et.al. | 2404.10234 | null |
2024-04-15 | One-Click Upgrade from 2D to 3D: Sandwiched RGB-D Video Compression for Stereoscopic Teleconferencing | Yueyu Hu et.al. | 2404.09979 | null |
2024-04-15 | Quantization of Large Language Models with an Overdetermined Basis | Daniil Merkulov et.al. | 2404.09737 | null |
2024-04-18 | Post-Training Network Compression for 3D Medical Image Segmentation: Reducing Computational Efforts via Tucker Decomposition | Tobias Weber et.al. | 2404.09683 | link |
2024-04-15 | MarsQE: Semantic-Informed Quality Enhancement for Compressed Martian Image | Chengfeng Liu et.al. | 2404.09433 | null |
2024-04-17 | Incremental data compression for PDE-constrained optimization with a data assimilation application | Xuejian Li et.al. | 2404.09323 | null |
2024-04-14 | A Joint Data Compression and Time-Delay Estimation Method For Distributed Systems via Extremum Encoding | Amir Weiss et.al. | 2404.09244 | null |
2024-04-12 | Lossy Image Compression with Foundation Diffusion Models | Lucas Relic et.al. | 2404.08580 | null |
2024-04-12 | Mitigating Challenges of the Space Environment for Onboard Artificial Intelligence: Design Overview of the Imaging Payload on SpIRIT | Miguel Ortiz del Castillo et.al. | 2404.08399 | null |
2024-04-11 | Video Compression Beyond VVC: Quantitative Analysis of Intra Coding Tools in Enhanced Compression Model (ECM) | Mohsen Abdoli et.al. | 2404.07872 | null |
2024-04-11 | Learning to Classify New Foods Incrementally Via Compressed Exemplars | Justin Yang et.al. | 2404.07507 | null |
2024-04-14 | A comparison between Shapefit compression and Full-Modelling method with PyBird for DESI 2024 and beyond | Y. Lai et.al. | 2404.07283 | link |
2024-04-10 | Exploring Repetitiveness Measures for Two-Dimensional Strings | Giuseppe Romana et.al. | 2404.07030 | null |
2024-04-10 | Fine color guidance in diffusion models and its application to image compression at extremely low bitrates | Tom Bordin et.al. | 2404.06865 | null |
2024-04-09 | Encoder-Quantization-Motion-based Video Quality Metrics | Yixu Chen et.al. | 2404.06620 | null |
2024-04-09 | DiffHarmony: Latent Diffusion Model Meets Image Harmonization | Pengfei Zhou et.al. | 2404.06139 | link |
2024-04-09 | Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey | Feng Liang et.al. | 2404.06114 | null |
2024-04-09 | Image and Video Compression using Generative Sparse Representation with Fidelity Controls | Wei Jiang et.al. | 2404.06076 | null |
2024-04-07 | Correcting Diffusion-Based Perceptual Image Compression with Privileged End-to-End Decoder | Yiyang Ma et.al. | 2404.04916 | null |
2024-04-07 | Task-Aware Encoder Control for Deep Video Compression | Xingtong Ge et.al. | 2404.04848 | null |
2024-04-06 | Power-Efficient Image Storage: Leveraging Super Resolution Generative Adversarial Network for Sustainable Compression and Reduced Carbon Footprint | Ashok Mondal et.al. | 2404.04642 | null |
2024-04-05 | ClickDiffusion: Harnessing LLMs for Interactive Precise Image Editing | Alec Helbling et.al. | 2404.04376 | link |
2024-04-03 | Convolutional variational autoencoders for secure lossy image compression in remote sensing | Alessandro Giuliano et.al. | 2404.03696 | null |
2024-03-25 | RL for Consistency Models: Faster Reward Guided Text-to-Image Generation | Owen Oertell et.al. | 2404.03673 | link |
2024-04-04 | Training LLMs over Neurally Compressed Text | Brian Lester et.al. | 2404.03626 | null |
2024-04-04 | Leveraging Interpolation Models and Error Bounds for Verifiable Scientific Machine Learning | Tyler Chang et.al. | 2404.03586 | link |
2024-04-04 | Semantic Compression with Information Lattice Learning | Haizi Yu et.al. | 2404.03131 | null |
2024-04-01 | Accounting for contact network uncertainty in epidemic inferences with Approximate Bayesian Computation | Maxwell H. Wang et.al. | 2404.02924 | null |
2024-04-03 | Building test batteries based on analysing random number generator tests within the framework of algorithmic information theory | Boris Ryabko et.al. | 2404.02708 | null |
2024-04-03 | Optimizing traffic signs and lights visibility for the teleoperation of autonomous vehicles through ROI compression | I. Dror et.al. | 2404.02481 | null |
2024-04-03 | MOPAR: A Model Partitioning Framework for Deep Learning Inference Services on Serverless Platforms | Jiaang Duan et.al. | 2404.02445 | null |
2024-04-02 | NeRFCodec: Neural Feature Compression Meets Neural Radiance Fields for Memory-Efficient Scene Representation | Sicheng Li et.al. | 2404.02185 | null |
2024-04-01 | The Rate-Distortion-Perception Trade-off: The Role of Private Randomness | Yassine Hamdi et.al. | 2404.01111 | null |
2024-03-31 | Metric dimensions of generalized Sierpiński graphs over squares | Savari Prabhu et.al. | 2404.00771 | null |
2024-03-27 | Computationally and Memory-Efficient Robust Predictive Analytics Using Big Data | Daniel Menges et.al. | 2403.19721 | null |
2024-03-28 | RootInteractive tool for multidimensional statistical analysis, machine learning and analytical model validation | Marian Invanov et.al. | 2403.19330 | null |
2024-03-28 | Uncertainty-Aware Deep Video Compression with Ensembles | Wufei Ma et.al. | 2403.19158 | null |
2024-04-08 | Neural Embedding Compression For Efficient Multi-Task Earth Observation Modelling | Carlos Gomes et.al. | 2403.17886 | link |
2024-03-26 | Low-Latency Neural Stereo Streaming | Qiqi Hou et.al. | 2403.17879 | null |
2024-03-26 | Fully-fused Multi-Layer Perceptrons on Intel Data Center GPUs | Kai Yuan et.al. | 2403.17607 | link |
2024-03-25 | Neural Image Compression with Quantization Rectifier | Wei Luo et.al. | 2403.17236 | null |
2024-03-25 | Invertible Diffusion Models for Compressed Sensing | Bin Chen et.al. | 2403.17006 | link |
2024-03-25 | Virtual Cylindrical PET for Efficient DOI Image Reconstruction with Sub-millimetre Resolution | Francisco E Enríquez-Mier-y-Terán et.al. | 2403.16465 | null |
2024-03-25 | Impact of Video Compression Artifacts on Fisheye Camera Visual Perception Tasks | Madhumitha Sakthi et.al. | 2403.16338 | null |
2024-03-24 | Laplacian-guided Entropy Model in Neural Codec with Blur-dissipated Synthesis | Atefeh Khoshkhahtinat et.al. | 2403.16258 | null |
2024-03-23 | Understanding The Effectiveness of Lossy Compression in Machine Learning Training Sets | Robert Underwood et.al. | 2403.15953 | null |
2024-03-23 | Droplet shape representation using Fourier series and autoencoders | Mihir Durve et.al. | 2403.15797 | null |
2024-03-21 | S2LIC: Learned Image Compression with the SwinV2 Block, Adaptive Channel-wise and Global-inter Attention Context | Yongqiang Wang et.al. | 2403.14471 | link |
2024-03-21 | Tensor network compressibility of convolutional models | Sukhbinder Singh et.al. | 2403.14379 | null |
2024-03-26 | Powerful Lossy Compression for Noisy Images | Shilv Cai et.al. | 2403.14135 | null |
2024-03-20 | String attractors and bi-infinite words | Pierre Béaur et.al. | 2403.13449 | null |
2024-03-19 | Super-High-Fidelity Image Compression via Hierarchical-ROI and Adaptive Quantization | Jixiang Luo et.al. | 2403.13030 | null |
2024-03-19 | Privacy-Preserving Face Recognition Using Trainable Feature Subtraction | Yuxi Mi et.al. | 2403.12457 | link |
2024-03-19 | VQ-NeRV: A Vector Quantized Neural Representation for Videos | Yunjie Xu et.al. | 2403.12401 | link |
2024-03-18 | Encoding of linear kinetic plasma problems in quantum circuits via data compression | Ivan Novikau et.al. | 2403.11989 | null |
2024-03-18 | Object Segmentation-Assisted Inter Prediction for Versatile Video Coding | Zhuoyuan Li et.al. | 2403.11694 | null |
2024-03-18 | Overfitted image coding at reduced complexity | Théophile Blard et.al. | 2403.11651 | link |
2024-03-18 | Hierarchical Frequency-based Upsampling and Refining for Compressed Video Quality Enhancement | Qianyu Zhang et.al. | 2403.11556 | null |
2024-03-18 | Earth+: on-board satellite imagery compression leveraging historical earth observations | Kuntai Du et.al. | 2403.11434 | null |
2024-03-17 | Fidelity-preserving Learning-Based Image Compression: Loss Function and Subjective Evaluation Methodology | Shima Mohammadi et.al. | 2403.11241 | link |
2024-03-16 | Channel-wise Feature Decorrelation for Enhanced Learned Image Compression | Farhad Pakdaman et.al. | 2403.10936 | null |
2024-03-16 | NARRATE: Versatile Language Architecture for Optimal Control in Robotics | Seif Ismail et.al. | 2403.10762 | link |
2024-03-15 | Process-and-Forward: Deep Joint Source-Channel Coding Over Cooperative Relay Networks | Chenghong Bian et.al. | 2403.10613 | null |
2024-03-15 | CPGA: Coding Priors-Guided Aggregation Network for Compressed Video Quality Enhancement | Qiang Zhu et.al. | 2403.10362 | link |
2024-03-15 | Interactive Distance Field Mapping and Planning to Enable Human-Robot Collaboration | Usama Ali et.al. | 2403.09988 | link |
2024-03-14 | SketchINR: A First Look into Sketches as Implicit Neural Representations | Hmrishav Bandyopadhyay et.al. | 2403.09344 | link |
2024-03-14 | Noise Dimension of GAN: An Image Compression Perspective | Ziran Zhu et.al. | 2403.09196 | null |
2024-03-20 | Content-aware Masked Image Modeling Transformer for Stereo Image Compression | Xinjie Zhang et.al. | 2403.08505 | link |
2024-03-12 | Approaching Rate-Distortion Limits in Neural Compression with Lattice Transform Coding | Eric Lei et.al. | 2403.07320 | null |
2024-03-11 | Grid Monitoring and Protection with Continuous Point-on-Wave Measurements and Generative AI | Lang Tong et.al. | 2403.06942 | null |
2024-03-16 | Enhancing Adversarial Training with Prior Knowledge Distillation for Robust Image Compression | Zhi Cao et.al. | 2403.06700 | null |
2024-03-13 | FSViewFusion: Few-Shots View Generation of Novel Objects | Rukhshanda Hussain et.al. | 2403.06394 | null |
2024-03-10 | Probing Image Compression For Class-Incremental Learning | Justin Yang et.al. | 2403.06288 | null |
2024-03-10 | Blockchain-Enabled Variational Information Bottleneck for IoT Networks | Qiong Wu et.al. | 2403.06129 | link |
2024-03-09 | Wavelet-Like Transform-Based Technology in Response to the Call for Proposals on Neural Network-Based Image Coding | Cunhui Dong et.al. | 2403.05937 | null |
2024-03-07 | Complexity-constrained quantum thermodynamics | Anthony Munson et.al. | 2403.04828 | null |
2024-03-07 | Image Coding for Machines with Edge Information Learning Using Segment Anything | Takahiro Shindo et.al. | 2403.04173 | link |
2024-03-06 | 3D Diffusion Policy | Yanjie Ze et.al. | 2403.03954 | link |
2024-03-06 | Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer | Naifu Xue et.al. | 2403.03736 | null |
2024-03-06 | ZF Beamforming Tensor Compression for Massive MIMO Fronthaul | Libin Zheng et.al. | 2403.03675 | null |
2024-03-06 | Space Complexity of Euclidean Clustering | Xiaoyi Zhu et.al. | 2403.02971 | null |
2024-03-05 | Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity | Hagyeong Lee et.al. | 2403.02944 | link |
2024-03-05 | Enhancing the Rate-Distortion-Perception Flexibility of Learned Image Codecs with Conditional Diffusion Decoders | Daniele Mari et.al. | 2403.02887 | null |
2024-03-04 | Dark Energy Survey Year 3 results: likelihood-free, simulation-based | N. Jeffrey et.al. | 2403.02314 | null |
2024-03-04 | Neural Network Assisted Lifting Steps For Improved Fully Scalable Lossy Image Compression in JPEG 2000 | Xinyue Li et.al. | 2403.01647 | link |
2024-03-03 | On the Compressibility of Quantized Large Language Models | Yu Mao et.al. | 2403.01384 | null |
2024-03-02 | Towards Accurate Lip-to-Speech Synthesis in-the-Wild | Sindhu Hegde et.al. | 2403.01087 | null |
2024-03-01 | Region-Adaptive Transform with Segmentation Prior for Image Compression | Yuxi Liu et.al. | 2403.00628 | link |
2024-03-07 | ODVista: An Omnidirectional Video Dataset for super-resolution and Quality Enhancement Tasks | Ahmed Telili et.al. | 2403.00604 | link |
2024-02-29 | Towards Explaining Deep Neural Network Compression Through a Probabilistic Latent Space | Mahsa Mozafari-Nia et.al. | 2403.00155 | null |
2024-02-29 | Deep Network for Image Compressed Sensing Coding Using Local Structural Sampling | Wenxue Cui et.al. | 2402.19111 | null |
2024-02-29 | Variable-Rate Learned Image Compression with Multi-Objective Optimization and Quantization-Reconstruction Offsets | Fatih Kamisli et.al. | 2402.18930 | link |
2024-02-29 | Towards Backward-Compatible Continual Learning of Image Compression | Zhihao Duan et.al. | 2402.18862 | link |
2024-02-29 | Exploration of Learned Lifting-Based Transform Structures for Fully Scalable and Accessible Wavelet-Like Image Compression | Xinyue Li et.al. | 2402.18761 | null |
2024-01-10 | Motion Guided Token Compression for Efficient Masked Video Modeling | Yukun Feng et.al. | 2402.18577 | null |
2024-02-28 | Tokenization Is More Than Compression | Craig W. Schmidt et.al. | 2402.18376 | link |
2024-02-28 | NERV++: An Enhanced Implicit Neural Video Representation | Ahmed Ghorbel et.al. | 2402.18305 | null |
2024-02-28 | Computing Minimal Absent Words and Extended Bispecial Factors with CDAWG Space | Shunsuke Inenaga et.al. | 2402.18090 | null |
2024-03-03 | Towards Optimal Learning of Language Models | Yuxian Gu et.al. | 2402.17759 | null |
2024-02-27 | | Gaoyuan Wang et.al. | 2402.17749 | null |
2024-02-27 | Bit Rate Matching Algorithm Optimization in JPEG-AI Verification Model | Panqi Jia et.al. | 2402.17487 | null |
2024-02-27 | Bit Distribution Study and Implementation of Spatial Quality Map in the JPEG-AI Standardization | Panqi Jia et.al. | 2402.17470 | null |
2024-02-29 | Neural Video Compression with Feature Modulation | Jiahao Li et.al. | 2402.17414 | link |
2024-01-19 | MB-RACS: Measurement-Bounds-based Rate-Adaptive Image Compressed Sensing Network | Yujun Huang et.al. | 2402.16855 | null |
2024-02-29 | MISC: Ultra-low Bitrate Image Semantic Compression Driven by Large Multimodal Model | Chunyi Li et.al. | 2402.16749 | link |
2024-02-26 | Enabling robust sensor network design with data processing and optimization making use of local beehive image and video files | Ephrance Eunice Namugenyi et.al. | 2402.16655 | null |
2024-02-26 | Resolution-Agnostic Neural Compression for High-Fidelity Portrait Video Conferencing via Implicit Radiance Fields | Yifei Li et.al. | 2402.16599 | null |
2024-02-26 | Distortion-Controlled Dithering with Reduced Recompression Rate | Morriel Kasher et.al. | 2402.16447 | null |
2024-02-26 | Adaptive Online Learning of Separable Path Graph Transforms for Intra-prediction | Wen-Yang Lu et.al. | 2402.16371 | null |
2024-02-26 | SPC-NeRF: Spatial Predictive Compression for Voxel Based Radiance Field | Zetian Song et.al. | 2402.16366 | null |
2024-02-24 | Traditional Transformation Theory Guided Model for Learned Image Compression | Zhiyuan Li et.al. | 2402.15744 | null |
2024-02-22 | Distributed Radiance Fields for Edge Video Compression and Metaverse Integration in Autonomous Driving | Eugen Šlapak et.al. | 2402.14642 | null |
2024-02-21 | Exploring the Limits of Semantic Image Compression at Micro-bits per Pixel | Jordan Dotzel et.al. | 2402.13536 | null |
2024-02-20 | Compressing the two-particle Green's function using wavelets: Theory and application to the Hubbard atom | Emin Moghadas et.al. | 2402.13030 | null |
2024-02-20 | RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models | Xinchen Zhang et.al. | 2402.12908 | link |
2024-02-20 | Transformer-based Learned Image Compression for Joint Decoding and Denoising | Yi-Hsin Chen et.al. | 2402.12888 | null |
2024-02-19 | Weakly Supervised Object Detection in Chest X-Rays with Differentiable ROI Proposal Networks and Soft ROI Pooling | Philip Müller et.al. | 2402.11985 | link |
2024-02-18 | 3D Point Cloud Compression with Recurrent Neural Network and Image Compression Methods | Till Beemelmanns et.al. | 2402.11680 | link |
2024-02-18 | Learning to Learn Faster from Human Feedback with Language Model Predictive Control | Jacky Liang et.al. | 2402.11450 | null |
2024-02-17 | TinyLIC-High efficiency lossy image compression method | Gaocheng Ma et.al. | 2402.11164 | null |
2024-02-15 | Analysis of Neural Video Compression Networks for 360-Degree Video Coding | Andy Regensky et.al. | 2402.10257 | null |
2024-02-14 | Reducing Texture Bias of Deep Neural Networks via Edge Enhancing Diffusion | Edgar Heinert et.al. | 2402.09530 | link |
2024-02-14 | A Comprehensive Review of Software and Hardware Energy Efficiency of Video Decoders | Matthias Kränzler et.al. | 2402.09001 | null |
2024-02-14 | Extreme Video Compression with Pre-trained Diffusion Models | Bohan Li et.al. | 2402.08934 | link |
2024-02-14 | Saliency-aware End-to-end Learned Variable-Bitrate 360-degree Image Compression | Oguzhan Gungordu et.al. | 2402.08862 | null |
2024-02-13 | Learned Image Compression with Text Quality Enhancement | Chih-Yu Lai et.al. | 2402.08643 | null |
2024-02-13 | Motion-Adaptive Inference for Flexible Learned B-Frame Compression | M. Akin Yilmaz et.al. | 2402.08550 | null |
2024-02-21 | A Neural-network Enhanced Video Coding Framework beyond ECM | Yanchen Zhao et.al. | 2402.08397 | null |
2024-02-13 | Improving Image Coding for Machines through Optimizing Encoder via Auxiliary Loss | Kei Iino et.al. | 2402.08267 | null |
2024-02-12 | Distributed Compression in the Era of Machine Learning: A Review of Recent Advances | Ezgi Ozyilkan et.al. | 2402.07997 | null |
2024-02-13 | Towards Meta-Pruning via Optimal Transport | Alexander Theus et.al. | 2402.07839 | link |
2024-02-09 | Parameter estimation for quantum jump unraveling | Marco Radaelli et.al. | 2402.06556 | link |
2024-02-07 | RAGE for the Machine: Image Compression with Low-Cost Random Access for Embedded Applications | Christian D. Rask et.al. | 2402.05974 | null |
2024-02-08 | Sandwiched Compression: Repurposing Standard Codecs with Neural Network Wrappers | Onur G. Guleryuz et.al. | 2402.05887 | link |
2024-02-08 | Joint End-to-End Image Compression and Denoising: Leveraging Contrastive Learning and Multi-Scale Self-ONNs | Yuxin Xie et.al. | 2402.05582 | null |
2024-02-05 | TexShape: Information Theoretic Sentence Embedding for Language Models | H. Kaan Kale et.al. | 2402.05132 | link |
2024-02-07 | Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth | Kevin Kögler et.al. | 2402.05013 | null |
2024-02-06 | A Novel Local and Hyper-Local Multicast Services Transmission Scheme for Beyond 5G Networks | Sweta Singh et.al. | 2402.03963 | null |
2024-02-06 | Cool-chic video: Learned video coding with 800 parameters | Thomas Leguay et.al. | 2402.03179 | link |
2024-02-05 | Perceptual Learned Image Compression via End-to-End JND-Based Optimization | Farhad Pakdaman et.al. | 2402.02836 | null |
2024-02-04 | Discovering More Effective Tensor Network Structure Search Algorithms via Large Language Models (LLMs) | Junhua Zeng et.al. | 2402.02456 | link |
2024-03-04 | RecNet: An Invertible Point Cloud Encoding through Range Image Embeddings for Multi-Robot Map Sharing and Reconstruction | Nikolaos Stathoulopoulos et.al. | 2402.02192 | null |
2024-02-03 | Generative Visual Compression: A Review | Bolin Chen et.al. | 2402.02140 | null |
2024-02-23 | Immersive Video Compression using Implicit Neural Representations | Ho Man Kwan et.al. | 2402.01596 | link |
2024-02-02 | Efficient Dynamic-NeRF Based Volumetric Video Coding with Rate Distortion Optimization | Zhiyu Zhang et.al. | 2402.01380 | null |
2024-02-02 | UCVC: A Unified Contextual Video Compression Framework with Joint P-frame and B-frame Coding | Jiayu Yang et.al. | 2402.01289 | null |
2024-02-02 | Flexible Variational Information Bottleneck: Achieving Diverse Compression with a Single Training | Sota Kudo et.al. | 2402.01238 | link |
2024-02-02 | The O2 software framework and GPU usage in ALICE online and offline reconstruction in Run 3 | Giulio Eulisse et.al. | 2402.01205 | null |
2024-02-01 | Compressed image quality assessment using stacking | S. Farhad Hosseini-Benvidi et.al. | 2402.00993 | null |
2024-02-04 | Evaluating Large Language Models for Generalization and Robustness via Data Compression | Yucheng Li et.al. | 2402.00861 | link |
2024-03-11 | LVC-LGMC: Joint Local and Global Motion Compensation for Learned Video Compression | Wei Jiang et.al. | 2402.00680 | null |
2024-02-01 | Gain of Grain: A Film Grain Handling Toolchain for VVC-based Open Implementations | Vignesh V Menon et.al. | 2402.00622 | null |
2024-01-31 | EPSD: Early Pruning with Self-Distillation for Efficient Model Compression | Dong Chen et.al. | 2402.00084 | null |
2024-01-31 | A Neural Enhancement Post-Processor with a Dynamic AV1 Encoder Configuration Strategy for CLIC 2024 | Darren Ramsook et.al. | 2401.18021 | null |
2024-01-31 | Robustly overfitting latents for flexible neural image compression | Yura Perugachi-Diaz et.al. | 2401.17789 | null |
2024-01-30 | A Group Theoretic Metric for Robot State Estimation Leveraging Chebyshev Interpolation | Varun Agrawal et.al. | 2401.17463 | null |
2024-01-30 | SLIC: A Learned Image Codec Using Structure and Color | Srivatsa Prativadibhayankaram et.al. | 2401.17246 | link |
2024-01-30 | Large Language Model Evaluation via Matrix Entropy | Lai Wei et.al. | 2401.17139 | link |
2024-01-30 | Local integrals of motion in dipole-conserving models with Hilbert space fragmentation | Patrycja Łydżba et.al. | 2401.17097 | null |
2024-01-29 | On Channel Simulation with Causal Rejection Samplers | Daniel Goc et.al. | 2401.16579 | null |
2024-01-29 | Spatial Decomposition and Temporal Fusion based Inter Prediction for Learned Video Compression | Xihua Sheng et.al. | 2401.15864 | null |
2024-01-29 | Bayesian one- and two-sided inference on the local effective dimension | Eduard Belitser et.al. | 2401.15816 | null |
2024-01-28 | Towards Arbitrary-Scale Histopathology Image Super-resolution: An Efficient Dual-branch Framework via Implicit Self-texture Enhancement | Minghong Duan et.al. | 2401.15613 | null |
2024-01-26 | Shadow simulation of quantum processes | Xuanqiang Zhao et.al. | 2401.14934 | null |
2024-01-26 | Study of the gOMP Algorithm for Recovery of Compressed Sensed Hyperspectral Images | Jon Alvarez Justo et.al. | 2401.14786 | null |
2024-01-26 | A Comparative Study of Compressive Sensing Algorithms for Hyperspectral Imaging Reconstruction | Jon Alvarez Justo et.al. | 2401.14762 | null |
2024-01-26 | Residual Quantization with Implicit Neural Codebooks | Iris Huijben et.al. | 2401.14732 | link |
2024-01-25 | Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression | Daxin Li et.al. | 2401.14007 | null |
2024-02-07 | Perceptual-oriented Learned Image Compression with Dynamic Kernel | Nianxiang Fu et.al. | 2401.13967 | null |
2024-01-25 | Conditional Neural Video Coding with Spatial-Temporal Super-Resolution | Henan Wang et.al. | 2401.13959 | null |
2024-01-24 | FLLIC: Functionally Lossless Image Compression | Xi Zhang et.al. | 2401.13616 | null |
2024-01-23 | Fast Implicit Neural Representation Image Codec in Resource-limited Devices | Xiang Liu et.al. | 2401.12587 | null |
2024-01-22 | PairwiseHist: Fast, Accurate and Space-Efficient Approximate Query Processing with Data Compression | Aaron Hurst et.al. | 2401.12018 | null |
2024-01-22 | A Training-Free Defense Framework for Robust Learned Image Compression | Myungseo Song et.al. | 2401.11902 | null |
2024-01-21 | Another Way to the Top: Exploit Contextual Clustering in Learned Image Coding | Yichi Zhang et.al. | 2401.11615 | null |
2024-01-21 | ColorVideoVDP: A visual difference predictor for image, video and display distortions | Rafal K. Mantiuk et.al. | 2401.11485 | link |
2024-01-21 | Data-driven compression of electron-phonon interactions | Yao Luo et.al. | 2401.11393 | null |
2024-01-20 | Learned Image Compression with Dual-Branch Encoder and Conditional Information Coding | Haisheng Fu et.al. | 2401.11093 | null |
2024-01-19 | NN-VVC: Versatile Video Coding boosted by self-supervisedly learned image coding for machines | Jukka I. Ahonen et.al. | 2401.10761 | null |
2024-01-19 | Bridging the gap between image coding for machines and humans | Nam Le et.al. | 2401.10732 | null |
2024-01-18 | Attack and Defense Analysis of Learned Image Compression | Tianyu Zhu et.al. | 2401.10345 | null |
2024-01-18 | Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions | Namitha Padmanabhan et.al. | 2401.10217 | null |
2024-01-18 | Depth Over RGB: Automatic Evaluation of Open Surgery Skills Using Depth Camera | Ido Zuckerman et.al. | 2401.10037 | null |
2024-01-18 | Memory Efficient Corner Detection for Event-driven Dynamic Vision Sensors | Pao-Sheng Vincent Sun et.al. | 2401.09797 | null |
2024-01-18 | Compressing MIMO Channel Submatrices with Tucker Decomposition: Enabling Efficient Storage and Reducing SINR Computation Overhead | Yuanwei Zhang et.al. | 2401.09792 | null |
2024-01-17 | Idempotence and Perceptual Image Compression | Tongda Xu et.al. | 2401.08920 | link |
2024-01-16 | End-to-End Optimized Image Compression with the Frequency-Oriented Transform | Yuefeng Zhang et.al. | 2401.08194 | null |
2024-01-17 | Learned Image Compression with ROI-Weighted Distortion and Bit Allocation | Wei Jiang et.al. | 2401.08154 | null |
2024-01-15 | Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning | Manish Sharma et.al. | 2401.08014 | null |
2024-01-15 | Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models | Dan Jacobellis et.al. | 2401.07957 | link |
2024-01-14 | Exploring Compressed Image Representation as a Perceptual Proxy: A Study | Chen-Hsiu Huang et.al. | 2401.07200 | link |
2024-01-13 | Progressive Feature Fusion Network for Enhancing Image Quality Assessment | Kaiqun Wu et.al. | 2401.06992 | null |
2024-01-12 | Efficient Parallel Algorithms for Inpainting-Based Representations of 4K Images -- Part II: Spatial and Tonal Data Optimization | Niklas Kämper et.al. | 2401.06747 | null |
2024-03-18 | LiDAR Depth Map Guided Image Compression Model | Alessandro Gnutti et.al. | 2401.06517 | null |
2024-01-11 | Transformer Masked Autoencoders for Next-Generation Wireless Communications: Architecture and Opportunities | Abdullah Zayat et.al. | 2401.06274 | null |
2024-01-11 | MGARD: A multigrid framework for high-performance, error-controlled data compression and refactoring | Qian Gong et.al. | 2401.05994 | null |
2024-01-10 | SnapCap: Efficient Snapshot Compressive Video Captioning | Jianqiao Sun et.al. | 2401.04903 | null |
2024-01-09 | Modified Levenberg-Marquardt Algorithm For Tensor CP Decomposition in Image Compression | Ramin Goudarzi Karim et.al. | 2401.04670 | null |
2024-01-09 | Optimal Transcoding Resolution Prediction for Efficient Per-Title Bitrate Ladder Estimation | Jinhai Yang et.al. | 2401.04405 | null |
2024-01-08 | Low-light Image Enhancement via CLIP-Fourier Guided Wavelet Diffusion | Minglong Xue et.al. | 2401.03788 | link |
2024-01-08 | A Video Coding Method Based on Neural Network for CLIC2024 | Zhengang Li et.al. | 2401.03623 | null |
2024-01-06 | Spatiotemporally adaptive compression for scientific dataset with feature preservation -- a case study on simulation data with extreme climate events analysis | Qian Gong et.al. | 2401.03317 | null |
2024-01-06 | Comparison of spectrum models as applied to single-particle | Thomas A. Trainor et.al. | 2401.03290 | null |
2024-01-06 | Transferable Learned Image Compression-Resistant Adversarial Perturbations | Yang Sui et.al. | 2401.03115 | null |
2024-01-05 | MsDC-DEQ-Net: Deep Equilibrium Model (DEQ) with Multi-scale Dilated Convolution for Image Compressive Sensing (CS) | Youhao Yu et.al. | 2401.02884 | null |
2024-03-08 | Importance Matching Lemma for Lossy Compression with Side Information | Buu Phan et.al. | 2401.02609 | null |
2024-01-04 | Cool-Chic: Perceptually Tuned Low Complexity Overfitted Image Coder | Théo Ladune et.al. | 2401.02156 | link |
2024-01-04 | ED: Perceptually tuned Enhanced Compression Model | Pierrick Philippe et.al. | 2401.02145 | null |
2024-01-02 | NU-Class Net: A Novel Deep Learning-based Approach for Video Quality Enhancement | Parham Zilouchian Moghaddam et.al. | 2401.01163 | null |
2024-01-28 | Higher-Order Cellular Automata Generated Symmetry-Protected Topological Phases and Detection Through Multi-Point Strange Correlators | Jie-Yu Zhang et.al. | 2401.00505 | null |
2023-12-28 | Selective Run-Length Encoding | Xutan Peng et.al. | 2312.17024 | null |
2023-12-29 | FFCA-Net: Stereo Image Compression via Fast Cascade Alignment of Side Information | Yichong Xia et.al. | 2312.16963 | null |
2023-12-26 | Range Entropy Queries and Partitioning | Sanjay Krishnan et.al. | 2312.15959 | null |
2023-12-25 | MaskCRT: Masked Conditional Residual Transformer for Learned Video Compression | Yi-Hsin Chen et.al. | 2312.15829 | null |
2023-12-25 | On Robust Wasserstein Barycenter: The Model and Algorithm | Xu Wang et.al. | 2312.15762 | null |
2023-12-25 | Scalable Face Image Coding via StyleGAN Prior: Towards Compression for Human-Machine Collaborative Vision | Qi Mao et.al. | 2312.15622 | null |
2023-12-22 | The Rate-Distortion-Perception-Classification Tradeoff: Joint Source Coding and Modulation via Inverse-Domain GANs | Junli Fang et.al. | 2312.14792 | null |
2024-01-09 | Enhanced Color Palette Modeling for Lossless Screen Content Compression | Hannah Och et.al. | 2312.14491 | null |
2023-12-30 | Efficient Communication in Federated Learning Using Floating-Point Lossy Compression | Grant Wilkins et.al. | 2312.13461 | null |
2023-12-19 | A Huffman based short message service compression technique using adjacent distance array | Pranta Sarker et.al. | 2312.12495 | null |
2023-12-19 | Full-reference Video Quality Assessment for User Generated Content Transcoding | Zihao Qi et.al. | 2312.12317 | null |
2023-12-19 | Low-Consumption Partial Transcoding by HEVC | Mohsen Abdoli et.al. | 2312.12174 | link |
2023-12-19 | Comparative Study of Hardware and Software Power Measurements in Video Compression | Angeliki Katsenou et.al. | 2312.12150 | link |
2023-12-18 | Blind-Touch: Homomorphic Encryption-Based Distributed Neural Network Inference for Privacy-Preserving Fingerprint Authentication | Hyunmin Choi et.al. | 2312.11575 | link |
2024-01-11 | Quantized Decoder in Learned Image Compression for Deterministic Reconstruction | Esin Koyuncu et.al. | 2312.11209 | null |
2023-12-19 | A Computationally Efficient Neural Video Compression Accelerator Based on a Sparse CNN-Transformer Hybrid Network | Siyu Zhang et.al. | 2312.10716 | null |
2023-12-17 | IntraSeismic: a coordinate-based learning approach to seismic inversion | Juan Romero et.al. | 2312.10568 | null |
2023-12-17 | Light-weight CNN-based VVC Inter Partitioning Acceleration | Yiqun Liu et.al. | 2312.10567 | null |
2023-12-16 | Statistical Analysis of Inter Coding in VVC Test Model (VTM) | Yiqun Liu et.al. | 2312.10406 | null |
2023-12-15 | IQNet: Image Quality Assessment Guided Just Noticeable Difference Prefiltering For Versatile Video Coding | Yu-Han Sun et.al. | 2312.09799 | null |
2023-12-15 | Towards Neuromorphic Compression based Neural Sensing for Next-Generation Wireless Implantable Brain Machine Interface | Vivek Mohan et.al. | 2312.09503 | null |
2023-12-14 | Geometry-Corrected Geodesic Motion Modeling with Per-Frame Camera Motion for 360-Degree Video Compression | Andy Regensky et.al. | 2312.09266 | link |
2023-12-14 | Efficient Online Learning of Contact Force Models for Connector Insertion | Kevin Tracy et.al. | 2312.09190 | null |
2023-12-13 | Balanced and Deterministic Weight-sharing Helps Network Performance | Oscar Chang et.al. | 2312.08401 | null |
2023-12-13 | Preparing VVC for Streaming: A Fast Multi-Rate Encoding Approach | Yiqun Liu et.al. | 2312.08330 | null |
2023-12-13 | CenterGrasp: Object-Aware Implicit Representation Learning for Simultaneous Shape Reconstruction and 6-DoF Grasp Estimation | Eugenio Chisari et.al. | 2312.08240 | null |
2023-12-13 | Explainable Trajectory Representation through Dictionary Learning | Yuanbo Tang et.al. | 2312.08052 | null |
2023-12-12 | Deep Hierarchical Video Compression | Ming Lu et.al. | 2312.07126 | null |
2023-12-12 | Communication Cost Reduction for Subgraph Counting under Local Differential Privacy via Hash Functions | Quentin Hillebrand et.al. | 2312.07055 | link |
2023-12-11 | RAFIC: Retrieval-Augmented Few-shot Image Classification | Hangfei Lin et.al. | 2312.06868 | link |
2023-12-11 | A New Projection Pursuit Index for Big Data | Yajie Duan et.al. | 2312.06465 | null |
2023-12-11 | Variational Auto-Encoder Based Deep Learning Technique For Filling Gaps in Reacting PIV Data | Shashank Yellapantula et.al. | 2312.06461 | null |
2023-12-07 | Analysis of Coding Gain Due to In-Loop Reshaping | Chau-Wai Wong et.al. | 2312.04022 | null |
2023-12-05 | C3: High-performance and low-complexity neural compression from a single image or video | Hyunjik Kim et.al. | 2312.02753 | null |
2023-12-05 | Unified learning-based lossy and lossless JPEG recompression | Jianghui Zhang et.al. | 2312.02705 | null |
2023-12-05 | Accelerating Learnt Video Codecs with Gradient Decay and Layer-wise Distillation | Tianhao Peng et.al. | 2312.02605 | null |
2023-12-04 | Hyperspectral Image Compression Using Sampling and Implicit Neural Representations | Shima Rezasoltani et.al. | 2312.01558 | null |
Publish Date | Title | Authors | Code | |
---|---|---|---|---|
2025-07-23 | Yume: An Interactive World Generation Model | Xiaofeng Mao et.al. | 2507.17744 | null |
2025-07-23 | STQE: Spatial-Temporal Quality Enhancement for G-PCC Compressed Dynamic Point Clouds | Tian Guo et.al. | 2507.17522 | null |
2025-07-23 | Unfolding Data Quality Dimensions in Practice: A Survey | Vasileios Papastergios et.al. | 2507.17507 | null |
2025-07-23 | DFDNet: Dynamic Frequency-Guided De-Flare Network | Minglong Xue et.al. | 2507.17489 | null |
2025-07-23 | Development of a Standardized Testing Environment for QRNGs based on Semiconductor Laser Phase Noise | Matthias Ostner et.al. | 2507.17471 | null |
2025-07-23 | Parametric Integration with Neural Integral Operators | Christoph Schied et.al. | 2507.17440 | null |
2025-07-23 | CAPRI-CT: Causal Analysis and Predictive Reasoning for Image Quality Optimization in Computed Tomography | Sneha George Gnanakalavathy et.al. | 2507.17420 | null |
2025-07-23 | Perceptual Classifiers: Detecting Generative Images using Perceptual Features | Krishna Srikar Durbha et.al. | 2507.17240 | null |
2025-07-23 | Hierarchical Fusion and Joint Aggregation: A Multi-Level Feature Representation Method for AIGC Image Quality Assessment | Linghe Meng et.al. | 2507.17182 | null |
2025-07-23 | UNICE: Training A Universal Image Contrast Enhancer | Ruodai Cui et.al. | 2507.17157 | null |
2025-07-22 | A Tutorial on MRI Reconstruction: From Modern Methods to Clinical Implications | Tolga Çukur et.al. | 2507.16715 | null |
2025-07-22 | Introducing Quality Estimation to Machine Translation Post-editing Workflow: An Empirical Study on Its Usefulness | Siqi Liu et.al. | 2507.16515 | null |
2025-07-22 | DenseSR: Image Shadow Removal as Dense Prediction | Yu-Fan Lin et.al. | 2507.16472 | null |
2025-07-22 | Navigating Large-Pose Challenge for High-Fidelity Face Reenactment with Video Diffusion Model | Mingtao Guo et.al. | 2507.16341 | null |
2025-07-22 | LMM4Edit: Benchmarking and Evaluating Multimodal Image Editing with LMMs | Zitong Xu et.al. | 2507.16193 | null |
2025-07-22 | LSSGen: Leveraging Latent Space Scaling in Flow and Diffusion for Efficient Text to Image Generation | Jyun-Ze Tang et.al. | 2507.16154 | null |
2025-07-21 | Improving Personalized Image Generation through Social Context Feedback | Parul Gupta et.al. | 2507.16095 | null |
2025-07-21 | "Just a strange pic": Evaluating 'safety' in GenAI Image safety annotation tasks from diverse annotators' perspectives | Ding Wang et.al. | 2507.16033 | null |
2025-07-21 | From Logic to Language: A Trust Index for Problem Solving with LLMs | Tehseen Rug et.al. | 2507.16028 | null |
2025-07-21 | Dream, Lift, Animate: From Single Images to Animatable Gaussian Avatars | Marcel C. Bühler et.al. | 2507.15979 | null |
2025-07-21 | A Lightweight Face Quality Assessment Framework to Improve Face Verification Performance in Real-Time Screening Applications | Ahmed Aman Ibrahim et.al. | 2507.15961 | null |
2025-07-21 | Efficient Face Image Quality Assessment via Self-training and Knowledge Distillation | Wei Sun et.al. | 2507.15709 | null |
2025-07-23 | Visual-Language Model Knowledge Distillation Method for Image Quality Assessment | Yongkang Hou et.al. | 2507.15680 | null |
2025-07-21 | SustainDiffusion: Optimising the Social and Environmental Sustainability of Stable Diffusion Models | Giordano d'Aloisio et.al. | 2507.15663 | null |
2025-07-21 | tiDAS: a time invariant approximation of the Delay and Sum algorithm for biomedical ultrasound PSF reconstructions | Chiara Razzetta et.al. | 2507.15464 | null |
2025-07-21 | Neuro-MSBG: An End-to-End Neural Model for Hearing Loss Simulation | Hui-Guan Yuan et.al. | 2507.15396 | null |
2025-07-21 | BEAM-Net: A Deep Learning Framework with Bone Enhancement Attention Mechanism for High Resolution High Frame Rate Ultrasound Beamforming | Midhila Madhusoodanan et.al. | 2507.15306 | null |
2025-07-23 | EndoControlMag: Robust Endoscopic Vascular Motion Magnification with Periodic Reference Resetting and Hierarchical Tissue-aware Dual-Mask Control | An Wang et.al. | 2507.15292 | null |
2025-07-21 | Conditional Video Generation for High-Efficiency Video Compression | Fangqiu Yi et.al. | 2507.15269 | null |
2025-07-20 | RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback | Qiaoyu Tang et.al. | 2507.15024 | null |
2025-07-20 | FastSmoothSAM: A Fast Smooth Method For Segment Anything Model | Jiasheng Xu et.al. | 2507.15008 | null |
2025-07-20 | Rate-Distortion-Perception Trade-off with Strong Realism Constraints: Role of Side Information and Common Randomness | Yassine Hamdi et.al. | 2507.14825 | null |
2025-07-20 | Distilling Parallel Gradients for Fast ODE Solvers of Diffusion Models | Beier Zhu et.al. | 2507.14797 | null |
2025-07-19 | QUTCC: Quantile Uncertainty Training and Conformal Calibration for Imaging Inverse Problems | Cassandra Tong Ye et.al. | 2507.14760 | null |
2025-07-19 | Benchmarking GANs, Diffusion Models, and Flow Matching for T1w-to-T2w MRI Translation | Andrea Moschetto et.al. | 2507.14575 | null |
2025-07-19 | Adaptive 3D Gaussian Splatting Video Streaming: Visual Saliency-Aware Tiling and Meta-Learning-Based Bitrate Adaptation | Han Gong et.al. | 2507.14454 | null |
2025-07-19 | Adaptive 3D Gaussian Splatting Video Streaming | Han Gong et.al. | 2507.14432 | null |
2025-07-18 | Hallucination Score: Towards Mitigating Hallucinations in Generative Image Super-Resolution | Weiming Ren et.al. | 2507.14367 | null |
2025-07-21 | Lessons from the TREC Plain Language Adaptation of Biomedical Abstracts (PLABA) track | Brian Ondov et.al. | 2507.14096 | null |
2025-07-18 | D2IP: Deep Dynamic Image Prior for 3D Time-sequence Pulmonary Impedance Imaging | Hao Fang et.al. | 2507.14046 | null |
2025-07-18 | Converting T1-weighted MRI from 3T to 7T quality using deep learning | Malo Gicquel et.al. | 2507.13782 | null |
2025-07-18 | Encapsulated Composition of Text-to-Image and Text-to-Video Models for High-Quality Video Synthesis | Tongtong Su et.al. | 2507.13753 | null |
2025-07-18 | ATRO: A Fast Solver-Free Algorithm for Topology and Routing Optimization of Reconfigurable Datacenter Networks | Yingming Mao et.al. | 2507.13717 | null |
2025-07-18 | Global Modeling Matters: A Fast, Lightweight and Effective Baseline for Efficient Image Restoration | Xingyu Jiang et.al. | 2507.13663 | null |
2025-07-18 | Isotropic Remeshing with Inter-Angle Optimization | Hanbing Zheng et.al. | 2507.13641 | null |
2025-07-18 | Unifying Listener Scoring Scales: Comparison Learning Framework for Speech Quality Assessment and Continuous Speech Emotion Recognition | Cheng-Hung Hu et.al. | 2507.13626 | null |
2025-07-18 | Efficient Burst Super-Resolution with One-step Diffusion | Kento Kawai et.al. | 2507.13607 | null |
2025-07-18 | TexGS-VolVis: Expressive Scene Editing for Volume Visualization via Textured Gaussian Splatting | Kaiyuan Tang et.al. | 2507.13586 | null |
2025-07-17 | | Dmitrii Mikhailov et.al. | 2507.13546 | null |
2025-07-17 | IConMark: Robust Interpretable Concept-Based Watermark For AI Images | Vinu Sankar Sadasivan et.al. | 2507.13407 | null |
2025-07-17 | Taming Diffusion Transformer for Real-Time Mobile Video Generation | Yushu Wu et.al. | 2507.13343 | null |
2025-07-17 | Label-Consistent Dataset Distillation with Detector-Guided Refinement | Yawen Zou et.al. | 2507.13074 | null |
2025-07-17 | Enkidu: Universal Frequential Perturbation for Real-Time Audio Privacy Protection against Voice Deepfakes | Zhou Feng et.al. | 2507.12932 | null |
2025-07-17 | DeQA-Doc: Adapting DeQA-Score to Document Image Quality Assessment | Junjie Gao et.al. | 2507.12796 | null |
2025-07-17 | Local Representative Token Guided Merging for Text-to-Image Generation | Min-Jeong Lee et.al. | 2507.12771 | null |
2025-07-17 | HairShifter: Consistent and High-Fidelity Video Hair Transfer via Anchor-Guided Animation | Wangzheng Shi et.al. | 2507.12758 | null |
2025-07-17 | Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images | Zahra TehraniNasab et.al. | 2507.12698 | null |
2025-07-16 | TRIQA: Image Quality Assessment by Contrastive Pretraining on Ordered Distortion Triplets | Rajesh Sureddi et.al. | 2507.12687 | null |
2025-07-16 | InSight: AI Mobile Screening Tool for Multiple Eye Disease Detection using Multimodal Fusion | Ananya Raghu et.al. | 2507.12669 | null |
2025-07-16 | Pathology-Guided Virtual Staining Metric for Evaluation and Training | Qiankai Wang et.al. | 2507.12624 | null |
2025-07-16 | FADE: Adversarial Concept Erasure in Flow Models | Zixuan Fu et.al. | 2507.12283 | null |
2025-07-16 | Translationese-index: Using Likelihood Ratios for Graded and Generalizable Measurement of Translationese | Yikang Liu et.al. | 2507.12260 | null |
2025-07-16 | MambaRate: Speech Quality Assessment Across Different Sampling Rates | Panos Kakoulidis et.al. | 2507.12090 | null |
2025-07-16 | CompressedVQA-HDR: Generalized Full-reference and No-reference Quality Assessment Models for Compressed High Dynamic Range Videos | Wei Sun et.al. | 2507.11900 | null |
2025-07-15 | JSQA: Speech Quality Assessment with Perceptually-Inspired Contrastive Pretraining Based on JND Audio Pairs | Junyi Fan et.al. | 2507.11636 | null |
2025-07-14 | Expert Operational GANS: Towards Real-Color Underwater Image Restoration | Ozer Can Devecioglu et.al. | 2507.11562 | null |
2025-07-14 | 3D Wavelet Latent Diffusion Model for Whole-Body MR-to-CT Modality Translation | Jiaxu Zheng et.al. | 2507.11557 | null |
2025-07-15 | CATVis: Context-Aware Thought Visualization | Tariq Mehmood et.al. | 2507.11522 | null |
2025-07-15 | P.808 Multilingual Speech Enhancement Testing: Approach and Results of URGENT 2025 Challenge | Marvin Sach et.al. | 2507.11306 | null |
2025-07-15 | Robust ID-Specific Face Restoration via Alignment Learning | Yushun Fang et.al. | 2507.10943 | null |
2025-07-15 | Evaluating Generated Commit Messages with Large Language Models | Qunhong Zeng et.al. | 2507.10906 | null |
2025-07-15 | Digital defocus aberration interference for automated optical microscopy | Haowen Zhou et.al. | 2507.10867 | null |
2025-07-16 | Text-Visual Semantic Constrained AI-Generated Image Quality Assessment | Qiang Li et.al. | 2507.10432 | null |
2025-07-14 | Spatial Lifting for Dense Prediction | Mingzhi Xu et.al. | 2507.10222 | null |
2025-07-14 | Nonlinear Spectral Fusion Super-Resolution Fluorescence Microscopy based on Progressively Saturated Upconversion Nanoparticles | Yongtao Liu et.al. | 2507.10129 | null |
2025-07-15 | Graph-based Multi-Modal Interaction Lightweight Network for Brain Tumor Segmentation (GMLN-BTS) in Edge Iterative MRI Lesion Localization System (EdgeIMLocSys) | Guohao Huo et.al. | 2507.09995 | null |
2025-07-14 | Aligning Generative Speech Enhancement with Human Preferences via Direct Preference Optimization | Haoyang Li et.al. | 2507.09929 | null |
2025-07-15 | IM-LUT: Interpolation Mixing Look-Up Tables for Image Super-Resolution | Sejin Park et.al. | 2507.09923 | null |
2025-07-13 | Towards Robust RTC in Sparse LEO Constellations | Aashish Gottipati et.al. | 2507.09798 | null |
2025-07-13 | Continental scale habitat modelling with artificial intelligence and multimodal earth observation | Sara Si-Moussi et.al. | 2507.09732 | null |
2025-07-13 | Hybrid Quantum-Classical Generative Adversarial Networks with Transfer Learning | Asma Al-Othni et.al. | 2507.09706 | null |
2025-07-13 | prNet: Data-Driven Phase Retrieval via Stochastic Refinement | Mehmet Onurcan Kaya et.al. | 2507.09608 | null |
2025-07-13 | Demystifying Flux Architecture | Or Greenberg et.al. | 2507.09595 | null |
2025-07-13 | WordCraft: Interactive Artistic Typography with Attention Awareness and Noise Blending | Zhe Wang et.al. | 2507.09573 | null |
2025-07-13 | RectifiedHR: High-Resolution Diffusion via Energy Profiling and Adaptive Guidance Scheduling | Ankit Sanjyal et.al. | 2507.09441 | null |
2025-07-12 | Deep Image Prior Assisted ISAR Imaging for Missing Data Case | Necmettin Bayar et.al. | 2507.09393 | null |
2025-07-11 | LLMCup: Ranking-Enhanced Comment Updating with LLMs | Hua Ge et.al. | 2507.08671 | null |
2025-07-11 | Refraction corrected specular beamforming applied to cortical bone enhances interface visibility of bone-soft tissues interfaces | Amadou S. Dia et.al. | 2507.08497 | null |
2025-07-11 | Upsample What Matters: Region-Adaptive Latent Sampling for Accelerated Diffusion Transformers | Wongi Jeong et.al. | 2507.08422 | null |
2025-07-11 | Enforcing Speech Content Privacy in Environmental Sound Recordings using Segment-wise Waveform Reversal | Modan Tailleur et.al. | 2507.08412 | null |
2025-07-11 | From Enhancement to Understanding: Build a Generalized Bridge for Low-light Vision via Semantically Consistent Unsupervised Fine-tuning | Sen Wang et.al. | 2507.08380 | null |
2025-07-11 | Unsupervised Methods for Video Quality Improvement: A Survey of Restoration and Enhancement Techniques | Alexandra Malyugina et.al. | 2507.08375 | null |
2025-07-11 | CoCo-Bot: Energy-based Composable Concept Bottlenecks for Interpretable Generative Models | Sangwon Kim et.al. | 2507.08334 | null |
2025-07-10 | Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling | Haoyu Wu et.al. | 2507.07982 | null |
2025-07-10 | Assessing the Alignment of Audio Representations with Timbre Similarity Ratings | Haokun Tian et.al. | 2507.07764 | null |
2025-07-10 | Energy Efficient p-Circuits for Generative Neural Networks | Lakshmi A. Ghantasala et.al. | 2507.07763 | null |
2025-07-10 | IRAF-SLAM: An Illumination-Robust and Adaptive Feature-Culling Front-End for Visual SLAM in Challenging Environments | Thanh Nguyen Canh et.al. | 2507.07752 | null |
2025-07-10 | Generic Speech Enhancement with Self-Supervised Representation Space Loss | Hiroshi Sato et.al. | 2507.07631 | null |
2025-07-10 | Diffusion-Guided Knowledge Distillation for Weakly-Supervised Low-Light Semantic Segmentation | Chunyan Wang et.al. | 2507.07578 | null |
2025-07-10 | SD-GS: Structured Deformable 3D Gaussians for Efficient Dynamic Scene Reconstruction | Wei Yao et.al. | 2507.07465 | null |
2025-07-10 | Degradation-Agnostic Statistical Facial Feature Transformation for Blind Face Restoration in Adverse Weather Conditions | Chang-Hwan Son et.al. | 2507.07464 | null |
2025-07-10 | Subject grounding to reduce electromagnetic interference for MRI scanners operating in unshielded environments | Beatrice Lena et.al. | 2507.07459 | null |
2025-07-08 | Generative Panoramic Image Stitching | Mathieu Tuli et.al. | 2507.07133 | null |
2025-07-09 | 4KAgent: Agentic Any Image to 4K Super-Resolution | Yushen Zuo et.al. | 2507.07105 | null |
2025-07-10 | Hallucinating 360°: Panoramic Street-View Generation via Local Scenes Diffusion and Probabilistic Prompting | Fei Teng et.al. | 2507.06971 | null |
2025-07-09 | Musical Source Separation Bake-Off: Comparing Objective Metrics with Human Perception | Noah Jaffe et.al. | 2507.06917 | null |
2025-07-09 | Speckle2Self: Self-Supervised Ultrasound Speckle Reduction Without Clean Data | Xuesong Li et.al. | 2507.06828 | null |
2025-07-09 | Democratizing High-Fidelity Co-Speech Gesture Video Generation | Xu Yang et.al. | 2507.06812 | null |
2025-07-09 | Better frame rates or better visuals? An early report of Esports player practice in Dota 2 | Arjun Madhusudan et.al. | 2507.06790 | null |
2025-07-08 | Deprecating Benchmarks: Criteria and Framework | Ayrton San Joaquin et.al. | 2507.06434 | null |
2025-07-08 | LangMamba: A Language-driven Mamba Framework for Low-dose CT Denoising with Vision-language Models | Zhihao Chen et.al. | 2507.06140 | null |
2025-07-08 | Bridging Sequential Deep Operator Network and Video Diffusion: Residual Refinement of Spatio-Temporal PDE Solutions | Jaewan Park et.al. | 2507.06133 | null |
2025-07-08 | Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis | Xintong Hu et.al. | 2507.06116 | null |
2025-07-08 | ScoreAdv: Score-based Targeted Generation of Natural Adversarial Examples via Diffusion Models | Chihan Huang et.al. | 2507.06078 | null |
2025-07-08 | Enhancing Synthetic CT from CBCT via Multimodal Fusion and End-To-End Registration | Maximilian Tschuchnig et.al. | 2507.06067 | null |
2025-07-08 | VisualSpeaker: Visually-Guided 3D Avatar Lip Synthesis | Alexandre Symeonidis-Herzig et.al. | 2507.06060 | null |
2025-07-08 | Semantic Certainty Assessment in Vector Retrieval Systems: A Novel Framework for Embedding Quality Evaluation | Y. Du et.al. | 2507.05933 | null |
2025-07-08 | Diffusion Dataset Condensation: Training Your Diffusion Model Faster with Less Data | Rui Huang et.al. | 2507.05914 | null |
2025-07-08 | D-FCGS: Feedforward Compression of Dynamic Gaussian Splatting for Free-Viewpoint Videos | Wenkang Zhang et.al. | 2507.05859 | null |
2025-07-08 | TalkFashion: Intelligent Virtual Try-On Assistant Based on Multimodal Large Language Model | Yujie Hu et.al. | 2507.05790 | null |
2025-07-08 | Text-Guided Token Communication for Wireless Image Transmission | Bole Liu et.al. | 2507.05781 | null |
2025-07-08 | MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos | Rongsheng Wang et.al. | 2507.05675 | null |
2025-07-08 | Diffusion-Based Limited-Angle CT Reconstruction under Noisy Conditions | Jiaqi Guo et.al. | 2507.05647 | null |
2025-07-08 | AdaptaGen: Domain-Specific Image Generation through Hierarchical Semantic Optimization Framework | Suoxiang Zhang et.al. | 2507.05621 | null |
2025-07-07 | Conversational Education at Scale: A Multi-LLM Agent Workflow for Procedural Learning and Pedagogic Quality Assessment | Jiahuan Pei et.al. | 2507.05528 | null |
2025-07-07 | LoomNet: Enhancing Multi-View Image Generation via Latent Space Weaving | Giulio Federico et.al. | 2507.05499 | null |
2025-07-07 | Self-supervised Deep Learning for Denoising in Ultrasound Microvascular Imaging | Lijie Huang et.al. | 2507.05451 | null |
2025-07-07 | Enhancing Underwater Images Using Deep Learning with Subjective Image Quality Integration | Jose M. Montero et.al. | 2507.05393 | null |
2025-07-07 | SegmentDreamer: Towards High-fidelity Text-to-3D Synthesis with Segmented Consistency Trajectory Distillation | Jiahao Zhu et.al. | 2507.05256 | null |
2025-07-07 | In-Context Learning as an Effective Estimator of Functional Correctness of LLM-Generated Code | Susmita Das et.al. | 2507.05200 | null |
2025-07-07 | SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model | Chun Xie et.al. | 2507.05148 | null |
2025-07-07 | Taming the Tri-Space Tension: ARC-Guided Hallucination Modeling and Control for Text-to-Image Generation | Jianjiang Yang et.al. | 2507.04946 | null |
2025-07-07 | HV-MMBench: Benchmarking MLLMs for Human-Centric Video Understanding | Yuxuan Cai et.al. | 2507.04909 | null |
2025-07-07 | Efficacy of Image Similarity as a Metric for Augmenting Small Dataset Retinal Image Segmentation | Thomas Wallace et.al. | 2507.04862 | null |
2025-07-07 | Identity-Preserving Text-to-Video Generation Guided by Simple yet Effective Spatial-Temporal Decoupled Representations | Yuji Wang et.al. | 2507.04705 | null |
2025-07-07 | Quantitative Single-particle Profiling of Extracellular Vesicles via Fluorescent Nanoparticle Tracking Analysis | Yiting Liu et.al. | 2507.04655 | null |
2025-07-07 | Introduction to the China Space Station Telescope (CSST) | CSST Collaboration et.al. | 2507.04618 | null |
2025-07-06 | TeleSim: A Network-Aware Testbed and Benchmark Dataset for Telerobotic Applications | Zexin Deng et.al. | 2507.04425 | null |
2025-07-06 | DMAT: An End-to-End Framework for Joint Atmospheric Turbulence Mitigation and Object Detection | Paul Hill et.al. | 2507.04323 | null |
2025-07-06 | Towards Lightest Low-Light Image Enhancement Architecture for Mobile Devices | Guangrui Bai et.al. | 2507.04277 | null |
2025-07-06 | Siberian radioheliograph image classification using ensemble of CLIP, EfficientNet and CatBoost models | Yaroslav Egorov et.al. | 2507.04211 | null |
2025-07-05 | Towards Spatially-Varying Gain and Binning | Anqi Yang et.al. | 2507.04190 | null |
2025-07-05 | A3FR: Agile 3D Gaussian Splatting with Incremental Gaze Tracked Foveated Rendering in Virtual Reality | Shuo Xin et.al. | 2507.04147 | null |
2025-07-05 | MMMOS: Multi-domain Multi-axis Audio Quality Assessment | Yi-Cheng Lin et.al. | 2507.04094 | null |
2025-07-05 | Gaussian-LIC2: LiDAR-Inertial-Camera Gaussian Splatting SLAM | Xiaolei Lang et.al. | 2507.04004 | null |
2025-07-08 | LEHA-CVQAD: Dataset To Enable Generalized Video Quality Assessment of Compression Artifacts | Aleksandr Gushchin et.al. | 2507.03990 | null |
2025-07-08 | StreamDiT: Real-Time Streaming Text-to-Video Generation | Akio Kodaira et.al. | 2507.03745 | null |
2025-07-04 | TACOS: Open Tagging and Comparative Scoring for Instruction Fine-Tuning Data Selection | Xixiang He et.al. | 2507.03673 | null |
2025-07-03 | RichControl: Structure- and Appearance-Rich Training-Free Spatial Control for Text-to-Image Generation | Liheng Zhang et.al. | 2507.02792 | null |
2025-07-03 | CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation | Xiangyang Luo et.al. | 2507.02691 | null |
2025-07-03 | Medical Data Pecking: A Context-Aware Approach for Automated Quality Evaluation of Structured Medical Data | Irena Girshovitz et.al. | 2507.02628 | null |
2025-07-03 | Addressing Camera Sensors Faults in Vision-Based Navigation: Simulation and Dataset Development | Riccardo Gallon et.al. | 2507.02602 | null |
2025-07-03 | IGDNet: Zero-Shot Robust Underexposed Image Enhancement via Illumination-Guided and Denoising | Hailong Yan et.al. | 2507.02445 | null |
2025-07-03 | Are Synthetic Videos Useful? A Benchmark for Retrieval-Centric Evaluation of Synthetic Videos | Zecheng Zhao et.al. | 2507.02316 | null |
2025-07-03 | MAC-Lookup: Multi-Axis Conditional Lookup Model for Underwater Image Enhancement | Fanghai Yi et.al. | 2507.02270 | null |
2025-07-02 | MobileIE: An Extremely Lightweight and Effective ConvNet for Real-Time Image Enhancement on Mobile Devices | Hailong Yan et.al. | 2507.01838 | null |
2025-07-02 | Enhancing Multi-Exposure High Dynamic Range Imaging with Overlapped Codebook for Improved Representation Learning | Keuntek Lee et.al. | 2507.01588 | null |
2025-07-02 | ReFlex: Text-Guided Editing of Real Images in Rectified Flow via Mid-Step Feature Extraction and Attention Adaptation | Jimyeong Kim et.al. | 2507.01496 | null |
2025-07-02 | SD-Acc: Accelerating Stable Diffusion through Phase-aware Sampling and Hardware Co-Optimizations | Zhican Wang et.al. | 2507.01309 | null |
2025-07-02 | Robust Brain Tumor Segmentation with Incomplete MRI Modalities Using Hölder Divergence and Mutual Information-Enhanced Knowledge Transfer | Runze Cheng et.al. | 2507.01254 | null |
2025-07-01 | Empirical Analysis Of Heuristic and Approximation Algorithms for the Mutual-Visibility Problem | Vanja Stojanović et.al. | 2507.01076 | null |
2025-07-01 | Diffusion Classifier Guidance for Non-robust Classifiers | Philipp Vaeth et.al. | 2507.00687 | null |
2025-07-01 | Mind the Detail: Uncovering Clinically Relevant Image Details in Accelerated MRI with Semantically Diverse Reconstructions | Jan Nikolas Morshuis et.al. | 2507.00670 | null |
2025-07-03 | MTCNet: Motion and Topology Consistency Guided Learning for Mitral Valve Segmentation in 4D Ultrasound | Rusi Chen et.al. | 2507.00660 | null |
2025-07-01 | Integration of quantum random number generators with post-quantum cryptography algorithms | Paula Alonso Blanco et.al. | 2507.00658 | null |
2025-07-01 | Physics-Informed Neural ODEs for Temporal Dynamics Modeling in Cardiac T1 Mapping | Nuno Capitão et.al. | 2507.00613 | null |
2025-07-01 | Latent Posterior-Mean Rectified Flow for Higher-Fidelity Perceptual Face Restoration | Xin Luo et.al. | 2507.00447 | null |
2025-07-01 | MedDiff-FT: Data-Efficient Diffusion Model Fine-tuning with Structural Guidance for Controllable Medical Image Synthesis | Jianhao Xie et.al. | 2507.00377 | null |
2025-07-01 | GDGS: 3D Gaussian Splatting Via Geometry-Guided Initialization And Dynamic Density Control | Xingjun Wang et.al. | 2507.00363 | null |
2025-06-30 | A High-Fidelity Speech Super Resolution Network using a Complex Global Attention Module with Spectro-Temporal Loss | Tarikul Islam Tamiti et.al. | 2507.00229 | null |
2025-06-30 | How to Design and Train Your Implicit Neural Representation for Video Compression | Matthew Gwilliam et.al. | 2506.24127 | null |
2025-06-30 | Epona: Autoregressive Diffusion World Model for Autonomous Driving | Kaiwen Zhang et.al. | 2506.24113 | null |
2025-06-30 | Navigating with Annealing Guidance Scale in Diffusion Space | Shai Yehezkel et.al. | 2506.24108 | null |
2025-06-30 | URGENT-PK: Perceptually-Aligned Ranking Model Designed for Speech Enhancement Competition | Jiahe Wang et.al. | 2506.23874 | null |
2025-07-03 | RGC-VQA: An Exploration Database for Robotic-Generated Video Quality Assessment | Jianing Jin et.al. | 2506.23852 | null |
2025-06-30 | Transition Matching: Scalable and Flexible Generative Modeling | Neta Shaul et.al. | 2506.23589 | null |
2025-06-30 | Metadata, Wavelet, and Time Aware Diffusion Models for Satellite Image Super Resolution | Luigi Sigillo et.al. | 2506.23566 | null |
2025-06-30 | TAG-WM: Tamper-Aware Generative Image Watermarking via Diffusion Inversion Sensitivity | Yuzhuo Chen et.al. | 2506.23484 | null |
2025-06-29 | Layer Decomposition and Morphological Reconstruction for Task-Oriented Infrared Image Enhancement | Siyuan Chai et.al. | 2506.23353 | null |
2025-06-29 | DiffFit: Disentangled Garment Warping and Texture Refinement for Virtual Try-On | Xiang Xu et.al. | 2506.23295 | null |
2025-06-29 | PixelBoost: Leveraging Brownian Motion for Realistic-Image Super-Resolution | Aradhana Mishra et.al. | 2506.23254 | null |
2025-06-29 | Research on Comprehensive Classroom Evaluation System Based on Multiple AI Models | Cong Xie et.al. | 2506.23079 | null |
2025-06-28 | Point Cloud Compression and Objective Quality Assessment: A Survey | Yiling Xu et.al. | 2506.22902 | null |
2025-06-28 | STR-Match: Matching SpatioTemporal Relevance Score for Training-Free Video Editing | Junsung Lee et.al. | 2506.22868 | null |
2025-06-28 | ICME 2025 Generalizable HDR and SDR Video Quality Measurement Grand Challenge | Yixu Chen et.al. | 2506.22790 | null |
2025-06-28 | VSRM: A Robust Mamba-Based Framework for Video Super-Resolution | Dinh Phu Tran et.al. | 2506.22762 | null |
2025-06-28 | Degradation-Modeled Multipath Diffusion for Tunable Metalens Photography | Jianing Zhang et.al. | 2506.22753 | null |
2025-06-27 | High Resolution Isotropic 3D Cine imaging with Automated Segmentation using Concatenated 2D Real-time Imaging and Deep Learning | Mark Wrobel et.al. | 2506.22532 | null |
2025-07-01 | Dehazing Light Microscopy Images with Guided Conditional Flow Matching: finding a sweet spot between fidelity and realism | Anirban Ray et.al. | 2506.22397 | null |
2025-06-27 | DIGS: Dynamic CBCT Reconstruction using Deformation-Informed 4D Gaussian Splatting and a Low-Rank Free-Form Deformation Model | Yuliang Huang et.al. | 2506.22280 | null |
2025-06-27 | ReF-LLE: Personalized Low-Light Enhancement via Reference-Guided Deep Reinforcement Learning | Ming Zhao et.al. | 2506.22216 | null |
2025-06-27 | RoboEnvision: A Long-Horizon Video Generation Model for Multi-Task Robot Manipulation | Liudi Yang et.al. | 2506.22007 | null |
2025-06-27 | HighRateMOS: Sampling-Rate Aware Modeling for Speech Quality Assessment | Wenze Ren et.al. | 2506.21951 | null |
2025-06-27 | Quality Assessment and Distortion-aware Saliency Prediction for AI-Generated Omnidirectional Images | Liu Yang et.al. | 2506.21925 | null |
2025-06-27 | GenEscape: Hierarchical Multi-Agent Generation of Escape Room Puzzles | Mengyi Shan et.al. | 2506.21839 | null |
2025-06-26 | TanDiT: Tangent-Plane Diffusion Transformer for High-Quality 360° Panorama Generation | Hakan Çapuk et.al. | 2506.21681 | null |
2025-06-26 | Lightweight Physics-Informed Zero-Shot Ultrasound Plane Wave Denoising | Hojat Asgariandehkordi et.al. | 2506.21499 | null |
2025-06-26 | Counterfactual Voting Adjustment for Quality Assessment and Fairer Voting in Online Platforms with Helpfulness Evaluation | Chang Liu et.al. | 2506.21362 | null |
2025-06-26 | Bridging Video Quality Scoring and Justification via Large Multimodal Models | Qizhi Xie et.al. | 2506.21011 | null |
2025-06-26 | Style-Aligned Image Composition for Robust Detection of Abnormal Cells in Cytopathology | Qiuyi Qi et.al. | 2506.21001 | null |
2025-06-26 | 3D Scene-Camera Representation with Joint Camera Photometric Optimization | Weichen Dai et.al. | 2506.20979 | null |
2025-06-26 | Response Quality Assessment for Retrieval-Augmented Generation via Conditional Conformal Factuality | Naihe Feng et.al. | 2506.20978 | null |
2025-06-26 | Consistent Zero-shot 3D Texture Synthesis Using Geometry-aware Diffusion and Temporal Video Models | Donggoo Kang et.al. | 2506.20946 | null |
2025-06-25 | Leveraging Vision-Language Models to Select Trustworthy Super-Resolution Samples Generated by Diffusion Models | Cansu Korkmaz et.al. | 2506.20832 | null |
2025-06-25 | HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling | Tobias Vontobel et.al. | 2506.20452 | null |
2025-06-25 | DreamAnywhere: Object-Centric Panoramic 3D Scene Generation | Edoardo Alberto Dominici et.al. | 2506.20367 | null |
2025-06-25 | FundaQ-8: A Clinically-Inspired Scoring Framework for Automated Fundus Image Quality Assessment | Lee Qi Zun et.al. | 2506.20303 | null |
2025-06-25 | TDiR: Transformer based Diffusion for Image Restoration Tasks | Abbas Anwar et.al. | 2506.20302 | null |
2025-06-25 | Ctrl-Z Sampling: Diffusion Sampling with Controlled Random Zigzag Explorations | Shunqi Mao et.al. | 2506.20294 | null |
2025-06-26 | Signal-to-noise and spatial resolution in in-line imaging. 2. Phase-contrast tomography | T. E. Gureyev et.al. | 2506.20277 | null |
2025-06-25 | RaRa Clipper: A Clipper for Gaussian Splatting Based on Ray Tracer and Rasterizer | Da Li et.al. | 2506.20202 | null |
2025-06-25 | MS-IQA: A Multi-Scale Feature Fusion Network for PET/CT Image Quality Assessment | Siqiao Li et.al. | 2506.20200 | null |
2025-06-24 | Diffusion-based Task-oriented Semantic Communications with Model Inversion Attack | Xuesong Wang et.al. | 2506.19886 | null |
2025-06-24 | Radial Attention: | Xingyang Li et.al. | 2506.19852 | null |
2025-06-24 | Active View Selector: Fast and Accurate Active View Selection with Cross Reference Image Quality Assessment | Zirui Wang et.al. | 2506.19844 | null |
2025-06-24 | Improving Progressive Generation with Decomposable Flow Matching | Moayed Haji-Ali et.al. | 2506.19839 | null |
2025-06-24 | Evaluating Compliance with Visualization Guidelines in Diagrams for Scientific Publications Using Large Vision Language Models | Johannes Rückert et.al. | 2506.19825 | null |
2025-06-24 | Guidance in the Frequency Domain Enables High-Fidelity Sampling at Low CFG Scales | Seyedmorteza Sadat et.al. | 2506.19713 | null |
2025-06-24 | Filling of incomplete sinograms from sparse PET detector configurations using a residual U-Net | Klara Leffler et.al. | 2506.19600 | null |
2025-06-24 | ReMAR-DS: Recalibrated Feature Learning for Metal Artifact Reduction and CT Domain Transformation | Mubashara Rehman et.al. | 2506.19531 | null |
2025-06-24 | Angio-Diff: Learning a Self-Supervised Adversarial Diffusion Model for Angiographic Geometry Generation | Zhifeng Wang et.al. | 2506.19455 | null |
2025-06-24 | NAADA: A Noise-Aware Attention Denoising Autoencoder for Dental Panoramic Radiographs | Khuram Naveed et.al. | 2506.19387 | null |
2025-06-24 | Learning to assess subjective impressions from speech | Yuto Kondo et.al. | 2506.19335 | null |
2025-06-24 | Style Transfer: A Decade Survey | Tianshan Zhang et.al. | 2506.19278 | null |
2025-06-24 | Automated Image Recognition Framework | Quang-Binh Nguyen et.al. | 2506.19261 | null |
2025-06-23 | VHU-Net: Variational Hadamard U-Net for Body MRI Bias Field Correction | Xin Zhu et.al. | 2506.19181 | null |
2025-06-23 | A B-Spline Finite Element Method for Cloth Simulation | Yuqi Meng et.al. | 2506.18867 | null |
2025-06-23 | Phantom-Data : Towards a General Subject-Consistent Video Generation Dataset | Zhuowei Chen et.al. | 2506.18851 | null |
2025-06-23 | ViDAR: Video Diffusion-Aware 4D Reconstruction From Monocular Inputs | Michal Nazarczuk et.al. | 2506.18792 | null |
2025-06-23 | Matrix-Game: Interactive World Foundation Model | Yifan Zhang et.al. | 2506.18701 | null |
2025-06-23 | RDPO: Real Data Preference Optimization for Physics Consistency Video Generation | Wenxu Qian et.al. | 2506.18655 | null |
2025-06-23 | 2D Triangle Splatting for Direct Differentiable Mesh Training | Kaifeng Sheng et.al. | 2506.18575 | link |
2025-06-23 | VQ-Insight: Teaching VLMs for AI-Generated Video Quality Understanding via Progressive Visual Reinforcement Learning | Xuanyu Zhang et.al. | 2506.18564 | null |
2025-06-23 | What You Think Is What You Get: Bridge User Intent and Transfer Function Design through Multimodal Large Language Models | Yiyao Wang et.al. | 2506.18407 | null |
2025-06-23 | Selecting N-lowest scores for training MOS prediction models | Yuto Kondo et.al. | 2506.18326 | null |
2025-06-23 | ARSAR-Net: Adaptively Regularized SAR Imaging Network via Non-matrix-inversion ADMM | Shiping Fu et.al. | 2506.18324 | null |
2025-06-23 | A Multi-Scale Spatial Attention-Based Zero-Shot Learning Framework for Low-Light Image Enhancement | Muhammad Azeem Aslam et.al. | 2506.18323 | null |
2025-06-23 | Rethinking Mean Opinion Scores in Speech Quality Assessment: Aggregation through Quantized Distribution Fitting | Yuto Kondo et.al. | 2506.18307 | null |
2025-06-22 | InspireDebate: Multi-Dimensional Subjective-Objective Evaluation-Guided Reasoning and Optimization for Debating | Fuyu Wang et.al. | 2506.18102 | null |
2025-06-22 | Face-Voice Association for Audiovisual Active Speaker Detection in Egocentric Recordings | Jason Clarke et.al. | 2506.18055 | null |
2025-06-22 | BPCLIP: A Bottom-up Image Quality Assessment from Distortion to Semantics Based on CLIP | Chenyue Song et.al. | 2506.17969 | null |
2025-06-21 | DreamJourney: Perpetual View Generation with Video Diffusion Models | Bo Pan et.al. | 2506.17705 | null |
2025-06-21 | Histopathology Image Report Generation by Vision Language Model with Multimodal In-Context Learning | Shih-Wen Liu et.al. | 2506.17645 | null |
2025-06-21 | MTSIC: Multi-stage Transformer-based GAN for Spectral Infrared Image Colorization | Tingting Liu et.al. | 2506.17540 | null |
2025-06-20 | The Hidden Cost of an Image: Quantifying the Energy Consumption of AI Image Generation | Giulia Bertazzini et.al. | 2506.17016 | null |
2025-06-20 | PET Tracer Separation Using Conditional Diffusion Transformer with Multi-latent Space Learning | Bin Huang et.al. | 2506.16934 | null |
2025-06-20 | Infrared and Visible Image Fusion Based on Implicit Neural Representations | Shuchen Sun et.al. | 2506.16773 | null |
2025-06-19 | MetaQAP -- A Meta-Learning Approach for Quality-Aware Pretraining in Image Quality Assessment | Muhammad Azeem Aslam et.al. | 2506.16601 | null |
2025-06-19 | DiffO: Single-step Diffusion for Image Compression at Ultra-Low Bitrates | Chanung Park et.al. | 2506.16572 | link |
2025-06-19 | Towards Bitrate-Efficient and Noise-Robust Speech Coding with Variable Bitrate RVQ | Yunkee Chae et.al. | 2506.16538 | null |
2025-06-19 | Data Compression with Relative Entropy Coding | Gergely Flamich et.al. | 2506.16309 | null |
2025-06-19 | Active MRI Acquisition with Diffusion Guided Bayesian Experimental Design | Jacopo Iollo et.al. | 2506.16237 | null |
2025-06-19 | From Coarse to Continuous: Progressive Refinement Implicit Neural Representation for Motion-Robust Anisotropic MRI Reconstruction | Zhenxuan Zhang et.al. | 2506.16210 | null |
2025-06-19 | Neural Prioritisation for Web Crawling | Francesza Pezzuti et.al. | 2506.16146 | null |
2025-06-19 | Enhanced Dermatology Image Quality Assessment via Cross-Domain Training | Ignacio Hernández Montilla et.al. | 2506.16116 | null |
2025-06-19 | Fast Training-free Perceptual Image Compression | Ziran Zhu et.al. | 2506.16102 | null |
2025-06-19 | STAR-Pose: Efficient Low-Resolution Video Human Pose Estimation via Spatial-Temporal Adaptive Super-Resolution | Yucheng Jin et.al. | 2506.16061 | null |
2025-06-19 | Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization | Cong Wang et.al. | 2506.15980 | link |
2025-06-19 | CORAL: Disentangling Latent Representations in Long-Tailed Diffusion | Esther Rodriguez et.al. | 2506.15933 | null |
2025-06-18 | Demystifying the Visual Quality Paradox in Multimodal Large Language Models | Shuo Xing et.al. | 2506.15645 | null |
2025-06-20 | One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution | Yujing Sun et.al. | 2506.15591 | link |
2025-06-18 | Advanced cervical cancer classification: enhancing pap smear images with hybrid PMD Filter-CLAHE | Ach Khozaimi et.al. | 2506.15489 | null |
2025-06-18 | When Model Knowledge meets Diffusion Model: Diffusion-assisted Data-free Image Synthesis with Alignment of Domain and Class | Yujin Kim et.al. | 2506.15381 | null |
2025-06-20 | Privacy-Preserving Chest X-ray Classification in Latent Space with Homomorphically Encrypted Neural Inference | Jonghun Kim et.al. | 2506.15258 | link |
2025-06-18 | RA-NeRF: Robust Neural Radiance Field Reconstruction with Accurate Camera Pose Estimation under Complex Trajectories | Qingsong Yan et.al. | 2506.15242 | null |
2025-06-18 | DM-FNet: Unified multimodal medical image fusion via diffusion process-trained encoder-decoder | Dan He et.al. | 2506.15218 | link |
2025-06-18 | Privacy-Shielded Image Compression: Defending Against Exploitation from Vision-Language Pretrained Models | Xuelin Shen et.al. | 2506.15201 | link |
2025-06-18 | You Only Render Once: Enhancing Energy and Computation Efficiency of Mobile Virtual Reality | Xingyu Chen et.al. | 2506.15183 | null |
2025-06-17 | A Comparative Evaluation of Deep Learning Models for Speech Enhancement in Real-World Noisy Environments | Md Jahangir Alam Khondkar et.al. | 2506.15000 | link |
2025-06-17 | Improved Image Reconstruction and Diffusion Parameter Estimation Using a Temporal Convolutional Network Model of Gradient Trajectory Errors | Jonathan B. Martin et.al. | 2506.14995 | link |
2025-06-17 | Plug-and-Play with 2.5D Artifact Reduction Prior for Fast and Accurate Industrial Computed Tomography Reconstruction | Haley Duba-Sullivan et.al. | 2506.14719 | null |
2025-06-17 | 3DGS-IEval-15K: A Large-scale Image Quality Evaluation Database for 3D Gaussian-Splatting | Yuke Xing et.al. | 2506.14642 | link |
2025-06-17 | QUEST: Quality-aware Semi-supervised Table Extraction for Business Documents | Eliott Thomas et.al. | 2506.14568 | null |
2025-06-17 | Causally Steered Diffusion for Automated Video Counterfactual Generation | Nikos Spyrou et.al. | 2506.14404 | link |
2025-06-17 | Compressed Video Super-Resolution based on Hierarchical Encoding | Yuxuan Jiang et.al. | 2506.14381 | null |
2025-06-18 | ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies | Jinyan Yuan et.al. | 2506.14315 | null |
2025-06-17 | Quality Assessment of Python Tests Generated by Large Language Models | Victor Alves et.al. | 2506.14297 | link |
2025-06-17 | synth-dacl: Does Synthetic Defect Data Enhance Segmentation Accuracy and Robustness for Real-World Bridge Inspections? | Johannes Flotzinger et.al. | 2506.14255 | null |
2025-06-17 | MAS-LitEval : Multi-Agent System for Literary Translation Quality Assessment | Junghwan Kim et.al. | 2506.14199 | null |
2025-06-17 | Breaking the Multi-Enhancement Bottleneck: Domain-Consistent Quality Enhancement for Compressed Images | Qunliang Xing et.al. | 2506.14152 | null |
2025-06-15 | Balancing Preservation and Modification: A Region and Semantic Aware Metric for Instruction-Based Image Editing | Zhuoying Li et.al. | 2506.13827 | null |
2025-06-14 | ReFrame: Layer Caching for Accelerated Inference in Real-Time Rendering | Lufei Liu et.al. | 2506.13814 | null |
2025-06-16 | SpeechRefiner: Towards Perceptual Quality Refinement for Front-End Algorithms | Sirui Li et.al. | 2506.13709 | null |
2025-06-16 | UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions | Zhucun Xue et.al. | 2506.13691 | null |
2025-06-17 | First Positronium Lifetime Imaging with Scandium-44 on a Long Axial Field-of-view PET/CT | Lorenzo Mercolli et.al. | 2506.13460 | null |
2025-06-16 | DicFace: Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration | Yan Chen et.al. | 2506.13355 | null |
2025-06-16 | Efficient Approximate Temporal Triangle Counting in Streaming with Predictions | Giorgio Venturin et.al. | 2506.13173 | link |
2025-06-14 | Towards Seamless Borders: A Method for Mitigating Inconsistencies in Image Inpainting and Outpainting | Xingzhong Hou et.al. | 2506.12530 | null |
2025-06-14 | Fine-Grained HDR Image Quality Assessment From Noticeably Distorted to Very High Fidelity | Mohsen Jenadeleh et.al. | 2506.12505 | null |
2025-06-14 | Real-Time Per-Garment Virtual Try-On with Temporal Consistency for Loose-Fitting Garments | Zaiqiang Wu et.al. | 2506.12348 | link |
2025-06-13 | ICME 2025 Grand Challenge on Video Super-Resolution for Video Conferencing | Babak Naderi et.al. | 2506.12269 | link |
2025-06-13 | Improving Speech Enhancement with Multi-Metric Supervision from Learned Quality Assessment | Wei Wang et.al. | 2506.12260 | null |
2025-06-13 | SphereDrag: Spherical Geometry-Aware Panoramic Image Editing | Zhiao Feng et.al. | 2506.11863 | null |
2025-06-13 | Fast MRI of bones in the knee -- An AI-driven reconstruction approach for adiabatic inversion recovery prepared ultra-short echo time sequences | Philipp Hans Nunn et.al. | 2506.11771 | null |
2025-06-13 | EyeSim-VQA: A Free-Energy-Guided Eye Simulation Framework for Video Quality Assessment | Zhaoyang Wang et.al. | 2506.11549 | null |
2025-06-13 | CGVQM+D: Computer Graphics Video Quality Metric and Dataset | Akshay Jindal et.al. | 2506.11546 | link |
2025-06-13 | Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders | Xingwei Sun et.al. | 2506.11514 | link |
2025-06-13 | Taming Stable Diffusion for Computed Tomography Blind Super-Resolution | Chunlei Li et.al. | 2506.11496 | null |
2025-06-13 | Byzantine Outside, Curious Inside: Reconstructing Data Through Malicious Updates | Kai Yue et.al. | 2506.11413 | null |
2025-06-13 | A Watermark for Auto-Regressive Image Generation Models | Yihan Wu et.al. | 2506.11371 | null |
2025-06-12 | Score-based Generative Diffusion Models to Synthesize Full-dose FDG Brain PET from MRI in Epilepsy Patients | Jiaqi Wu et.al. | 2506.11297 | null |
2025-06-12 | M4V: Multi-Modal Mamba for Text-to-Video Generation | Jiancheng Huang et.al. | 2506.10915 | null |
2025-06-12 | AIR: Zero-shot Generative Model Adaptation with Iterative Refinement | Guimeng Liu et.al. | 2506.10895 | link |
2025-06-12 | Stroke-based Cyclic Amplifier: Image Super-Resolution at Arbitrary Ultra-Large Scales | Wenhao Guo et.al. | 2506.10774 | null |
2025-06-12 | Underage Detection through a Multi-Task and MultiAge Approach for Screening Minors in Unconstrained Imagery | Christopher Gaul et.al. | 2506.10689 | null |
2025-06-12 | Unsourced Adversarial CAPTCHA: A Bi-Phase Adversarial CAPTCHA Framework | Xia Du et.al. | 2506.10685 | null |
2025-06-12 | High-resolution efficient image generation from WiFi CSI using a pretrained latent diffusion model | Eshan Ramesh et.al. | 2506.10605 | null |
2025-06-12 | A Crack in the Bark: Leveraging Public Knowledge to Remove Tree-Ring Watermarks | Junhua Lin et.al. | 2506.10502 | null |
2025-06-12 | Low-Barrier Dataset Collection with Real Human Body for Interactive Per-Garment Virtual Try-On | Zaiqiang Wu et.al. | 2506.10468 | link |
2025-06-12 | PointGS: Point Attention-Aware Sparse View Synthesis with Gaussian Splatting | Lintao Xiang et.al. | 2506.10335 | null |
2025-06-12 | Research on Audio-Visual Quality Assessment Dataset and Method for User-Generated Omnidirectional Video | Fei Zhao et.al. | 2506.10331 | null |
2025-06-12 | DUN-SRE: Deep Unrolling Network with Spatiotemporal Rotation Equivariance for Dynamic MRI Reconstruction | Yuliang Zhu et.al. | 2506.10309 | null |
2025-06-12 | Discrete Audio Tokens: More Than a Survey! | Pooneh Mousavi et.al. | 2506.10274 | null |
2025-06-11 | D-LiFT: Improving LLM-based Decompiler Backend via Code Quality-driven Fine-tuning | Muqi Zou et.al. | 2506.10125 | null |
2025-06-10 | Ambient Diffusion Omni: Training Good Models with Bad Data | Giannis Daras et.al. | 2506.10038 | link |
2025-06-10 | FastFLUX: Pruning FLUX with Block-wise Replacement and Sandwich Training | Fuhan Cai et.al. | 2506.10035 | null |
2025-06-11 | EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits | Ron Yosef et.al. | 2506.09988 | null |
2025-06-11 | Error-Guided Pose Augmentation: Enhancing Rehabilitation Exercise Assessment through Targeted Data Generation | Omar Sherif et.al. | 2506.09833 | null |
2025-06-11 | Metritocracy: Representative Metrics for Lite Benchmarks | Ariel Procaccia et.al. | 2506.09813 | null |
2025-06-11 | Learning Quality from Complexity and Structure: A Feature-Fused XGBoost Model for Video Quality Assessment | Amritha Premkumar et.al. | 2506.09795 | null |
2025-06-11 | A High-Quality Dataset and Reliable Evaluation for Interleaved Image-Text Generation | Yukang Feng et.al. | 2506.09427 | null |
2025-06-10 | Seedance 1.0: Exploring the Boundaries of Video Generation Models | Yu Gao et.al. | 2506.09113 | null |
2025-06-10 | Enhancing Synthetic CT from CBCT via Multimodal Fusion: A Study on the Impact of CBCT Quality and Alignment | Maximilian Tschuchnig et.al. | 2506.08716 | null |
2025-06-10 | Brevity is the soul of sustainability: Characterizing LLM response lengths | Soham Poddar et.al. | 2506.08686 | link |
2025-06-10 | Biologically Inspired Deep Learning Approaches for Fetal Ultrasound Image Classification | Rinat Prochii et.al. | 2506.08623 | null |
2025-06-10 | LiftVSR: Lifting Image Diffusion to Video Super-Resolution via Hybrid Temporal Modeling with Only 4 | Xijun Wang et.al. | 2506.08529 | null |
2025-06-10 | Enhancing Motion Dynamics of Image-to-Video Models via Adaptive Low-Pass Guidance | June Suk Choi et.al. | 2506.08456 | null |
2025-06-10 | Better Reasoning with Less Data: Enhancing VLMs Through Unified Modality Scoring | Mingjie Xu et.al. | 2506.08429 | null |
2025-06-10 | Image Demoiréing Using Dual Camera Fusion on Mobile Phones | Yanting Mei et.al. | 2506.08361 | link |
2025-06-10 | How Much To Guide: Revisiting Adaptive Guidance in Classifier-Free Guidance Text-to-Vision Diffusion Models | Huixuan Zhang et.al. | 2506.08351 | null |
2025-06-10 | Complex-Valued Holographic Radiance Fields | Yicheng Zhan et.al. | 2506.08350 | null |
2025-06-09 | High-density three-dimensional holography using rapid modulation of light | Jorge-Alberto Peralta-Ángeles et.al. | 2506.08253 | null |
2025-06-09 | Dreamland: Controllable World Creation with Simulator and Generative Models | Sicheng Mo et.al. | 2506.08006 | null |
2025-06-09 | Audio-Sync Video Generation with Multi-Stream Temporal Control | Shuchen Weng et.al. | 2506.08003 | null |
2025-06-09 | Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor | Rishit Dagli et.al. | 2506.07932 | null |
2025-06-09 | Efficient Seismic Data Interpolation via Sparse Attention Transformer and Diffusion Model | Xiaoli Wei et.al. | 2506.07923 | null |
2025-06-09 | Video Unlearning via Low-Rank Refusal Vector | Simone Facchiano et.al. | 2506.07891 | null |
2025-06-09 | Diffusion Counterfactual Generation with Semantic Abduction | Rajat Rasal et.al. | 2506.07883 | link |
2025-06-09 | M2Restore: Mixture-of-Experts-based Mamba-CNN Fusion Framework for All-in-One Image Restoration | Yongzhen Wang et.al. | 2506.07814 | null |
2025-06-09 | Research quality evaluation by AI in the era of Large Language Models: Advantages, disadvantages, and systemic effects | Mike Thelwall et.al. | 2506.07748 | null |
2025-06-09 | PIG: Physically-based Multi-Material Interaction with 3D Gaussians | Zeyu Xiao et.al. | 2506.07657 | null |
2025-06-09 | Information-guided optimization of image-based sensorless adaptive optics methods | Biwei Zhang et.al. | 2506.07482 | null |
2025-06-09 | Compressed Feature Quality Assessment: Dataset and Baselines | Changsheng Gao et.al. | 2506.07412 | null |
2025-06-09 | Distributed Image Semantic Communication via Nonlinear Transform Coding | Yufei Bo et.al. | 2506.07391 | null |
2025-06-08 | Multi-Step Guided Diffusion for Image Restoration on Edge Devices: Toward Lightweight Perception in Embodied AI | Aditya Chakravarty et.al. | 2506.07286 | null |
2025-06-08 | First positronium imaging using | Manish Das et.al. | 2506.07230 | null |
2025-06-08 | Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models | Sangwon Jang et.al. | 2506.07177 | null |
2025-06-08 | Mathesis: Towards Formal Theorem Proving from Natural Languages | Yu Xuejun et.al. | 2506.07047 | null |
2025-06-08 | Deep regularization networks for inverse problems with noisy operators | Fatemeh Pourahmadian et.al. | 2506.07008 | null |
2025-06-07 | SPC to 3D: Novel View Synthesis from Binary SPC via I2I translation | Sumit Sharma et.al. | 2506.06890 | null |
2025-06-07 | Controllable Coupled Image Generation via Diffusion Models | Chenfei Yuan et.al. | 2506.06826 | null |
2025-06-07 | An Efficient Digital Watermarking Technique for Small Scale devices | Kaushik Talathi et.al. | 2506.06691 | null |
2025-06-06 | Bidirectional Image-Event Guided Low-Light Image Enhancement | Zhanwen Liu et.al. | 2506.06120 | null |
2025-06-06 | On Inverse Problems, Parameter Estimation, and Domain Generalization | Deborah Pereg et.al. | 2506.06024 | null |
2025-06-06 | Rethinking Semi-supervised Segmentation Beyond Accuracy: Reliability and Robustness | Steven Landgraf et.al. | 2506.05917 | null |
2025-06-05 | On-the-fly Reconstruction for Large-Scale Novel View Synthesis from Unposed Images | Andreas Meuleman et.al. | 2506.05558 | null |
2025-06-05 | EX-4D: EXtreme Viewpoint 4D Video Synthesis via Depth Watertight Mesh | Tao Hu et.al. | 2506.05554 | null |
2025-06-05 | F2T2-HiT: A U-Shaped FFT Transformer and Hierarchical Transformer for Reflection Removal | Jie Cai et.al. | 2506.05489 | null |
2025-06-05 | Implicit Neural Representation for Video Restoration | Mary Aiyetigbo et.al. | 2506.05488 | null |
2025-06-05 | Degradation-Aware Image Enhancement via Vision-Language Classification | Jie Cai et.al. | 2506.05450 | null |
2025-06-05 | SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training | Jianyi Wang et.al. | 2506.05301 | null |
2025-06-06 | Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers | Haosong Liu et.al. | 2506.05096 | null |
2025-06-05 | Robustness as Architecture: Designing IQA Models to Withstand Adversarial Perturbations | Igor Meleshin et.al. | 2506.04951 | null |
2025-06-05 | Time-Lapse Video-Based Embryo Grading via Complementary Spatial-Temporal Pattern Mining | Yong Sun et.al. | 2506.04950 | null |
2025-06-05 | Deep learning image burst stacking to reconstruct high-resolution ground-based solar observations | Christoph Schirninger et.al. | 2506.04781 | null |
2025-06-05 | Physics Informed Capsule Enhanced Variational AutoEncoder for Underwater Image Enhancement | Niki Martinel et.al. | 2506.04753 | null |
2025-06-05 | Towards Holistic Visual Quality Assessment of AI-Generated Videos: A LLM-Based Multi-Dimensional Evaluation Model | Zelu Qi et.al. | 2506.04715 | link |
2025-06-05 | StatsMerging: Statistics-Guided Model Merging via Task-Specific Teacher Distillation | Ranjith Merugu et.al. | 2506.04567 | link |
2025-06-04 | HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation | Hermann Kumbong et.al. | 2506.04421 | null |
2025-06-04 | Fine-Tuning Video Transformers for Word-Level Bangla Sign Language: A Comparative Analysis for Classification Tasks | Jubayer Ahmed Bhuiyan Shawon et.al. | 2506.04367 | null |
2025-06-04 | SSIMBaD: Sigma Scaling with SSIM-Guided Balanced Diffusion for AnimeFace Colorization | Junpyo Seo et.al. | 2506.04283 | link |
2025-06-04 | Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation | Tianyu Huang et.al. | 2506.04225 | null |
2025-06-04 | SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models | Yuhao Wu et.al. | 2506.04180 | null |
2025-06-04 | Synthetic multi-inversion time magnetic resonance images for visualization of subcortical structures | Savannah P. Hays et.al. | 2506.04173 | null |
2025-06-04 | Point Cloud Quality Assessment Using the Perceptual Clustering Weighted Graph (PCW-Graph) and Attention Fusion Network | Abdelouahed Laazoufi et.al. | 2506.04081 | null |
2025-06-04 | Collaborative On-Sensor Array Cameras | Jipeng Sun et.al. | 2506.04061 | null |
2025-06-04 | Joint Video Enhancement with Deblurring, Super-Resolution, and Frame Interpolation Network | Giyong Choi et.al. | 2506.03892 | null |
2025-06-04 | EuroGEST: Investigating gender stereotypes in multilingual language models | Jacqueline Rowe et.al. | 2506.03867 | null |
2025-06-04 | Conformer-based Ultrasound-to-Speech Conversion | Ibrahim Ibrahimov et.al. | 2506.03831 | link |
2025-06-04 | ControlThinker: Unveiling Latent Semantics for Controllable Image Generation through Visual Reasoning | Feng Han et.al. | 2506.03596 | link |
2025-06-04 | DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models | Ziyi Wu et.al. | 2506.03517 | null |
2025-06-03 | CamCloneMaster: Enabling Reference-based Camera Control for Video Generation | Yawen Luo et.al. | 2506.03140 | null |
2025-06-03 | DCM: Dual-Expert Consistency Model for Efficient and High-Quality Video Generation | Zhengyao Lv et.al. | 2506.03123 | null |
2025-06-03 | Astrophotography turbulence mitigation via generative models | Joonyeoup Kim et.al. | 2506.02981 | null |
2025-06-03 | IMPARA-GED: Grammatical Error Detection is Boosting Reference-free Grammatical Error Quality Estimator | Yusuke Sakai et.al. | 2506.02899 | null |
2025-06-03 | NTIRE 2025 XGC Quality Assessment Challenge: Methods and Results | Xiaohong Liu et.al. | 2506.02875 | null |
2025-06-03 | ControlMambaIR: Conditional Controls with State-Space Model for Image Restoration | Cheng Yang et.al. | 2506.02633 | null |
2025-06-03 | One-Step Diffusion-based Real-World Image Super-Resolution with Visual Perception Distillation | Xue Wu et.al. | 2506.02605 | null |
2025-06-03 | Multi-modal brain MRI synthesis based on SwinUNETR | Haowen Pang et.al. | 2506.02467 | null |
2025-06-02 | Motion aware video generative model | Bowen Xue et.al. | 2506.02244 | null |
2025-06-02 | SALF-MOS: Speaker Agnostic Latent Features Downsampled for MOS Prediction | Saurabh Agrawal et.al. | 2506.02082 | null |
2025-06-02 | IMAGHarmony: Controllable Image Editing with Consistent Object Quantity and Layout | Fei Shen et.al. | 2506.01949 | null |
2025-06-04 | MedEBench: Revisiting Text-instructed Image Editing on Medical Domain | Minghao Liu et.al. | 2506.01921 | null |
2025-06-02 | Beyond Pixel Agreement: Large Language Models as Clinical Guardrails for Reliable Medical Image Segmentation | Jiaxi Sheng et.al. | 2506.01841 | null |
2025-06-02 | WorldExplorer: Towards Generating Fully Navigable 3D Scenes | Manuel-Andreas Schneider et.al. | 2506.01799 | null |
2025-06-03 | Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability | Genta Indra Winata et.al. | 2506.01789 | link |
2025-06-02 | STORM: Benchmarking Visual Rating of MLLMs with a Comprehensive Ordinal Regression Dataset | Jinhong Wang et.al. | 2506.01738 | null |
2025-06-02 | Observation of the Crab Nebula with the Single-Mirror Small-Size Telescope stereoscopic system at low altitude | C. Alispach et.al. | 2506.01733 | null |
2025-06-02 | Self-Supervised Speech Quality Assessment (S3QA): Leveraging Speech Foundation Models for a Scalable Speech Quality Metric | Mattson Ogg et.al. | 2506.01655 | null |
2025-06-02 | Enhancing Diffusion-based Unrestricted Adversarial Attacks via Adversary Preferences Alignment | Kaixun Jiang et.al. | 2506.01511 | null |
2025-06-02 | Universal Preference-Score-based Pairwise Speech Quality Assessment | Yu-Fei Shi et.al. | 2506.01455 | null |
2025-05-30 | ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL | Yu Zhang et.al. | 2505.24875 | null |
2025-05-30 | LegalEval-Q: A New Benchmark for The Quality Evaluation of LLM-Generated Legal Text | Li yunhan et.al. | 2505.24826 | link |
2025-05-30 | Draw ALL Your Imagine: A Holistic Benchmark and Agent Framework for Complex Instruction-based Image Generation | Yucheng Zhou et.al. | 2505.24787 | link |
2025-05-30 | RT-X Net: RGB-Thermal cross attention network for Low-Light Image Enhancement | Raman Jha et.al. | 2505.24705 | link |
2025-05-30 | ARECHO: Autoregressive Evaluation via Chain-Based Hypothesis Optimization for Speech Multi-Metric Estimation | Jiatong Shi et.al. | 2505.24518 | null |
2025-05-30 | Digital twins enable full-reference quality assessment of photoacoustic image reconstructions | Janek Gröhl et.al. | 2505.24514 | null |
2025-05-30 | Reason-SVG: Hybrid Reward RL for Aha-Moments in Vector Graphics Generation | Ximing Xing et.al. | 2505.24499 | null |
2025-05-30 | VietMix: A Naturally Occurring Vietnamese-English Code-Mixed Corpus with Iterative Augmentation for Machine Translation | Hieu Tran et.al. | 2505.24472 | null |
2025-05-30 | EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering | Runnan Lu et.al. | 2505.24417 | link |
2025-05-30 | PCIE_Interaction Solution for Ego4D Social Interaction Challenge | Kanokphan Lertniphonphan et.al. | 2505.24404 | null |
2025-05-30 | Interactive Video Generation via Domain Adaptation | Ishaan Rawal et.al. | 2505.24253 | null |
2025-05-30 | STORK: Improving the Fidelity of Mid-NFE Sampling for Diffusion and Flow Matching Models | Zheng Tan et.al. | 2505.24210 | link |
2025-05-29 | DGIQA: Depth-guided Feature Attention and Refinement for Generalizable Image Quality Assessment | Vaishnav Ramesh et.al. | 2505.24002 | link |
2025-05-29 | DarkDiff: Advancing Low-Light Raw Enhancement by Retasking Diffusion Models for Camera ISP | Amber Yijia Zheng et.al. | 2505.23743 | null |
2025-05-29 | SenWiCh: Sense-Annotation of Low-Resource Languages for WiC using Hybrid Methods | Roksana Goworek et.al. | 2505.23714 | null |
2025-05-29 | Multilook Coherent Imaging: Theoretical Guarantees and Algorithms | Xi Chen et.al. | 2505.23594 | link |
2025-05-29 | R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation | Kaijie Chen et.al. | 2505.23493 | null |
2025-05-29 | VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation | Shi-Xue Zhang et.al. | 2505.23484 | link |
2025-05-29 | Fine-Tuning Next-Scale Visual Autoregressive Models with Group Relative Policy Optimization | Matteo Gallici et.al. | 2505.23331 | null |
2025-05-29 | Quality assessment of 3D human animation: Subjective and objective evaluation | Rim Rekik et.al. | 2505.23301 | null |
2025-05-29 | UniTEX: Universal High Fidelity Generative Texturing for 3D Shapes | Yixun Liang et.al. | 2505.23253 | link |
2025-05-29 | Advancing Image Super-resolution Techniques in Remote Sensing: A Comprehensive Survey | Yunliang Qi et.al. | 2505.23248 | null |
2025-05-30 | WTEFNet: Real-Time Low-Light Object Detection for Advanced Driver Assistance Systems | Hao Wu et.al. | 2505.23201 | null |
2025-05-29 | Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement | Gabriele Sarti et.al. | 2505.23183 | link |
2025-05-29 | MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation | Siyuan Wang et.al. | 2505.23120 | link |
2025-05-29 | TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance | Keren Ye et.al. | 2505.23119 | null |
2025-05-28 | ATI: Any Trajectory Instruction for Controllable Video Generation | Angtian Wang et.al. | 2505.22944 | null |
2025-05-28 | Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape | Ruichen Chen et.al. | 2505.22918 | link |
2025-05-28 | Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation | Zhe Kong et.al. | 2505.22647 | link |
2025-05-28 | Scaling-up Perceptual Video Quality Assessment | Ziheng Jia et.al. | 2505.22543 | null |
2025-05-28 | PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models | Junwen Chen et.al. | 2505.22523 | null |
2025-05-28 | Understanding Adversarial Training with Energy-based Models | Mujtaba Hussain Mirza et.al. | 2505.22486 | null |
2025-05-28 | Large-Area Fabrication-aware Computational Diffractive Optics | Kaixuan Wei et.al. | 2505.22313 | null |
2025-05-28 | FaceEditTalker: Interactive Talking Head Generation with Facial Attribute Editing | Guanwen Feng et.al. | 2505.22141 | null |
2025-05-28 | Real-Time Blind Defocus Deblurring for Earth Observation: The IMAGIN-e Mission Approach | Alejandro D. Mousist et.al. | 2505.22128 | null |
2025-05-28 | SridBench: Benchmark of Scientific Research Illustration Drawing of Image Generation Model | Yifan Chang et.al. | 2505.22126 | null |
2025-05-28 | High Volume Rate 3D Ultrasound Reconstruction with Diffusion Models | Tristan S. W. Stevens et.al. | 2505.22090 | null |
2025-05-28 | AquaMonitor: A multimodal multi-view image sequence dataset for real-life aquatic invertebrate biodiversity monitoring | Mikko Impiö et.al. | 2505.22065 | null |
2025-05-28 | One-Way Ticket:Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models | Senmao Li et.al. | 2505.21960 | null |
2025-05-28 | Patch-based Reconstruction for Unsupervised Dynamic MRI using Learnable Tensor Function with Implicit Neural Representation | Yuanyuan Liu et.al. | 2505.21894 | null |
2025-05-28 | FPAN: Mitigating Replication in Diffusion Models through the Fine-Grained Probabilistic Addition of Noise to Token Embeddings | Jingqi Xu et.al. | 2505.21848 | null |
2025-05-27 | HDRSDR-VQA: A Subjective Video Quality Dataset for HDR and SDR Comparative Evaluation | Bowen Chen et.al. | 2505.21831 | null |
2025-05-29 | Rethinking Chunk Size For Long-Document Retrieval: A Multi-Dataset Analysis | Sinchana Ramakanth Bhat et.al. | 2505.21700 | link |
2025-05-27 | Generalizable and Relightable Gaussian Splatting for Human Novel View Synthesis | Yipengjing Sun et.al. | 2505.21502 | null |
2025-05-27 | Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers | Wei Pang et.al. | 2505.21497 | link |
2025-05-27 | Policy Optimized Text-to-Image Pipeline Design | Uri Gadot et.al. | 2505.21478 | null |
2025-05-27 | OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers | Ziqiao Peng et.al. | 2505.21448 | null |
2025-05-28 | Towards Robust Automated Perceptual Voice Quality Assessment with Speech Foundation Models | Whenty Ariyanti et.al. | 2505.21356 | null |
2025-05-27 | Model as Loss: A Self-Consistent Training Paradigm | Saisamarth Rajesh Phaye et.al. | 2505.21156 | null |
2025-05-27 | Conditional Diffusion Models with Classifier-Free Gibbs-like Guidance | Badr Moufad et.al. | 2505.21101 | link |
2025-05-27 | All-optical discrete illumination-based compressed ultrafast photography | Long Cheng et.al. | 2505.21086 | null |
2025-05-27 | Inverse Virtual Try-On: Generating Multi-Category Product-Style Images from Clothed Individuals | Davide Lobba et.al. | 2505.21062 | link |
2025-05-27 | Proposal for the optical design of three robust and highly performing FPI systems for the European Solar Telescope | Goran B. Scharmer et.al. | 2505.21053 | null |
2025-05-28 | CityGo: Lightweight Urban Modeling and Rendering with Proxy Buildings and Residual Gaussians | Weihang Liu et.al. | 2505.21041 | null |
2025-05-27 | RainFusion: Adaptive Video Generation Acceleration via Multi-Dimensional Visual Redundancy | Aiyue Chen et.al. | 2505.21036 | null |
2025-05-27 | Generative Image Compression by Estimating Gradients of the Rate-variable Feature Distribution | Minghao Han et.al. | 2505.20984 | null |
2025-05-28 | Stereo Radargrammetry Using Deep Learning from Airborne SAR Images | Tatsuya Sasayama et.al. | 2505.20876 | null |
2025-05-27 | Uni-VERSA: Versatile Speech Assessment with a Unified Network | Jiatong Shi et.al. | 2505.20741 | null |
2025-05-27 | LeDiFlow: Learned Distribution-guided Flow Matching to Accelerate Image Generation | Pascal Zwick et.al. | 2505.20723 | link |
2025-05-27 | Photography Perspective Composition: Towards Aesthetic Perspective Recommendation | Lujian Yao et.al. | 2505.20655 | null |
2025-05-27 | InstGenIE: Generative Image Editing Made Efficient with Mask-aware Caching and Scheduling | Xiaoxiao Jiang et.al. | 2505.20600 | null |
2025-05-26 | MultLFG: Training-free Multi-LoRA composition using Frequency-domain Guidance | Aniket Roy et.al. | 2505.20525 | null |
2025-05-26 | In-Context Brush: Zero-shot Customized Subject Insertion with Context-Aware Latent Space Manipulation | Yu Xu et.al. | 2505.20271 | null |
2025-05-26 | Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion | Zheqi Lv et.al. | 2505.20053 | link |
2025-05-26 | ICDM: Interference Cancellation Diffusion Models for Wireless Semantic Communications | Tong Wu et.al. | 2505.19983 | null |
2025-05-26 | PHI: Bridging Domain Shift in Long-Term Action Quality Assessment via Progressive Hierarchical Instruction | Kanglei Zhou et.al. | 2505.19972 | link |
2025-05-27 | Dynamic-I2V: Exploring Image-to-Video Generation Models via Multimodal LLM | Peng Liu et.al. | 2505.19901 | null |
2025-05-26 | Navigating PESQ: Up-to-Date Versions and Open Implementations | Matteo Torcoli et.al. | 2505.19760 | link |
2025-05-26 | CIDRe: A Reference-Free Multi-Aspect Criterion for Code Comment Quality Measurement | Maria Dziuba et.al. | 2505.19757 | null |
2025-05-26 | HAODiff: Human-Aware One-Step Diffusion via Dual-Prompt Guidance | Jue Gong et.al. | 2505.19742 | link |
2025-05-26 | Modeling Beyond MOS: Quality Assessment Models Must Integrate Context, Reasoning, and Multimodality | Mohamed Amine Kerkouri et.al. | 2505.19696 | null |
2025-05-26 | DriveCamSim: Generalizable Camera Simulation via Explicit Camera Modeling for Autonomous Driving | Wenchao Sun et.al. | 2505.19692 | link |
2025-05-26 | Burst Image Super-Resolution via Multi-Cross Attention Encoding and Multi-Scan State-Space Decoding | Tengda Huang et.al. | 2505.19668 | null |
2025-05-26 | VTBench: Comprehensive Benchmark Suite Towards Real-World Virtual Try-on Models | Hu Xiaobin et.al. | 2505.19571 | link |
2025-05-26 | TDVE-Assessor: Benchmarking and Evaluating the Quality of Text-Driven Video Editing with LMMs | Juntong Wang et.al. | 2505.19535 | null |
2025-05-26 | Ring artifacts correction method in x-ray computed tomography based on stripe classification and removal in sinogram images | Yang Zou et.al. | 2505.19513 | null |
2025-05-26 | ViewCraft3D: High-Fidelity and View-Consistent 3D Vector Graphics Synthesis | Chuang Wang et.al. | 2505.19492 | null |
2025-05-26 | Force Prompting: Video Generation Models Can Learn and Generalize Physics-based Control Signals | Nate Gillman et.al. | 2505.19386 | null |
2025-05-26 | Neural nanophotonic object detector with ultra-wide field-of-view | Ji Chen et.al. | 2505.19379 | null |
2025-05-25 | SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline | Helin Wang et.al. | 2505.19314 | link |
2025-05-25 | Step-level Reward for Free in RL-based T2I Diffusion Model Fine-tuning | Xinyao Liao et.al. | 2505.19196 | link |
2025-05-25 | Triangle Splatting for Real-Time Radiance Field Rendering | Jan Held et.al. | 2505.19175 | null |
2025-05-25 | MIND-Edit: MLLM Insight-Driven Editing via Language-Vision Projection | Shuyu Wang et.al. | 2505.19149 | null |
2025-05-23 | RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration | Sudarshan Rajagopalan et.al. | 2505.18047 | null |
2025-05-23 | DiffusionReward: Enhancing Blind Face Restoration through Reward Feedback Learning | Bin Wu et.al. | 2505.17910 | null |
2025-05-23 | U2-BENCH: Benchmarking Large Vision-Language Models on Ultrasound Understanding | Anjie Le et.al. | 2505.17779 | null |
2025-05-23 | SafeMVDrive: Multi-view Safety-Critical Driving Video Synthesis in the Real World Domain | Jiawei Zhou et.al. | 2505.17727 | null |
2025-05-23 | CAS-IQA: Teaching Vision-Language Models for Synthetic Angiography Quality Assessment | Bo Wang et.al. | 2505.17619 | null |
2025-05-23 | Model Already Knows the Best Noise: Bayesian Active Noise Selection via Attention in Video Diffusion Model | Kwanyoung Kim et.al. | 2505.17561 | null |
2025-05-23 | Deeper Diffusion Models Amplify Bias | Shahin Hakemi et.al. | 2505.17560 | null |
2025-05-23 | PD | Dezheng Bao et.al. | 2505.17492 | null |
2025-05-23 | Dual Ascent Diffusion for Inverse Problems | Minseo Kim et.al. | 2505.17353 | null |
2025-05-22 | Mitigate One, Skew Another? Tackling Intersectional Biases in Text-to-Image Models | Pushkar Shukla et.al. | 2505.17280 | null |
2025-05-22 | GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning | Chengqi Duan et.al. | 2505.17022 | link |
2025-05-23 | Active Speech Enhancement: Active Speech Denoising Decliping and Deveraberation | Ofir Yaish et.al. | 2505.16911 | null |
2025-05-22 | Perceptual Quality Assessment for Embodied AI | Chunyi Li et.al. | 2505.16815 | link |
2025-05-22 | Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining | Shangquan Sun et.al. | 2505.16811 | null |
2025-05-22 | REPA Works Until It Doesn't: Early-Stopped, Holistic Alignment Supercharges Diffusion Training | Ziqiao Wang et.al. | 2505.16792 | link |
2025-05-22 | Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning | Jian Liu et.al. | 2505.16761 | null |
2025-05-22 | One-Step Diffusion-Based Image Compression with Semantic Distillation | Naifu Xue et.al. | 2505.16687 | null |
2025-05-22 | Performance of Objective Speech Quality Metrics on Languages Beyond Validation Data: A Study of Turkish and Korean | Javier Perez et.al. | 2505.16616 | null |
2025-05-22 | ScholarBench: A Bilingual Benchmark for Abstraction, Comprehension, and Reasoning Evaluation in Academic Contexts | Dongwon Noh et.al. | 2505.16566 | null |
2025-05-22 | SHaDe: Compact and Consistent Dynamic 3D Reconstruction via Tri-Plane Deformation and Latent Diffusion | Asrar Alruwayqi et.al. | 2505.16535 | null |
2025-05-22 | Utilizing citation index and synthetic quality measure to compare Wikipedia languages across various topics | Włodzimierz Lewoniewski et.al. | 2505.16506 | null |
2025-05-22 | InspectionV3: Enhancing Tobacco Quality Assessment with Deep Convolutional Neural Networks for Automated Workshop Management | Yao Wei et.al. | 2505.16485 | null |
2025-05-22 | UBGAN: Enhancing Coded Speech with Blind and Guided Bandwidth Extension | Kishan Gupta et.al. | 2505.16404 | null |
2025-05-22 | FPQVAR: Floating Point Quantization for Visual Autoregressive Model with FPGA Hardware Co-design | Renjie Wei et.al. | 2505.16335 | link |
2025-05-22 | NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment | Shuhao Han et.al. | 2505.16314 | null |
2025-05-22 | A Shape-Aware Total Body Photography System for In-focus Surface Coverage Optimization | Wei-Lun Huang et.al. | 2505.16228 | null |
2025-05-22 | Generative Latent Coding for Ultra-Low Bitrate Image and Video Compression | Linfeng Qi et.al. | 2505.16177 | null |
2025-05-22 | LLMs Are Not Scorers: Rethinking MT Evaluation with Generation-Based Methods | Hyang Cui et.al. | 2505.16129 | link |
2025-05-22 | OSCAR: One-Step Diffusion Codec Across Multiple Bit-rates | Jinpei Guo et.al. | 2505.16091 | link |
2025-05-21 | CP-LLM: Context and Pixel Aware Large Language Model for Video Quality Assessment | Wen Wen et.al. | 2505.16025 | null |
2025-05-21 | Leveraging the Powerful Attention of a Pre-trained Diffusion Model for Exemplar-based Image Colorization | Satoshi Kosugi et.al. | 2505.15812 | link |
2025-05-21 | RUSplatting: Robust 3D Gaussian Splatting for Sparse-View Underwater Scene Reconstruction | Zhuodong Jiang et.al. | 2505.15737 | null |
2025-05-21 | Guidelines for the Quality Assessment of Energy-Aware NAS Benchmarks | Nick Kocher et.al. | 2505.15631 | null |
2025-05-21 | Accelerating Autoregressive Speech Synthesis Inference With Speech Speculative Decoding | Zijian Lin et.al. | 2505.15380 | null |
2025-05-21 | SHEET: A Multi-purpose Open-source Speech Human Evaluation Estimation Toolkit | Wen-Chin Huang et.al. | 2505.15061 | link |
2025-05-20 | Training-Free Watermarking for Autoregressive Image Generation | Yu Tong et.al. | 2505.14673 | link |
2025-05-20 | Neural Inverse Scattering with Score-based Regularization | Yuan Gao et.al. | 2505.14560 | null |
2025-05-20 | Automated, Cross-Layer Root Cause Analysis of 5G Video-Conferencing Quality Degradation | Fan Yi et.al. | 2505.14540 | null |
2025-05-20 | VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank | Tianhe Wu et.al. | 2505.14460 | link |
2025-05-20 | Accuracy and Fairness of Facial Recognition Technology in Low-Quality Police Images: An Experiment With Synthetic Faces | Maria Cuellar et.al. | 2505.14320 | null |
2025-05-20 | Towards Generating Realistic Underwater Images | Abdul-Kazeem Shamba et.al. | 2505.14296 | null |
2025-05-20 | High-energy X-ray phase-contrast CT of an adult human chest phantom | Jannis N. Ahlers et.al. | 2505.14075 | null |
2025-05-21 | Recreating Neural Activity During Speech Production with Language and Speech Model Embeddings | Owais Mujtaba Khanday et.al. | 2505.14074 | link |
2025-05-20 | OmniStyle: Filtering High Quality Style Transfer Data at Scale | Ye Wang et.al. | 2505.14028 | null |
2025-05-20 | RLVR-World: Training World Models with Reinforcement Learning | Jialong Wu et.al. | 2505.13934 | link |
2025-05-20 | Automated Quality Evaluation of Cervical Cytopathology Whole Slide Images Based on Content Analysis | Lanlan Kang et.al. | 2505.13875 | null |
2025-05-20 | Exploring Image Quality Assessment from a New Perspective: Pupil Size | Yixuan Gao et.al. | 2505.13841 | null |
2025-05-19 | Learning Wavelet-Sparse FDK for 3D Cone-Beam CT Reconstruction | Yipeng Sun et.al. | 2505.13579 | null |
2025-05-19 | Neural-Enhanced Rate Adaptation and Computation Distribution for Emerging mmWave Multi-User 3D Video Streaming Systems | Babak Badnava et.al. | 2505.13337 | null |
2025-05-19 | Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models | Lucas Berry et.al. | 2505.13273 | null |
2025-05-19 | Hybrid 3D-4D Gaussian Splatting for Fast Dynamic Scene Representation | Seungjun Oh et.al. | 2505.13215 | link |
2025-05-19 | Combinatorial Sample-and Back-Focal-Plane (BFP) Imaging. Pt. I: Instrument and acquisition parameters affecting BFP images and their analysis | Omer Shavit et.al. | 2505.13190 | null |
2025-05-19 | Higher fidelity perceptual image and video compression with a latent conditioned residual denoising diffusion model | Jonas Brenig et.al. | 2505.13152 | link |
2025-05-19 | ARIW-Framework: Adaptive Robust Iterative Watermarking Framework | Shaowu Wu et.al. | 2505.13101 | null |
2025-05-19 | Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking | Zihan Su et.al. | 2505.12667 | null |
2025-05-18 | DPCD: A Quality Assessment Database for Dynamic Point Clouds | Yating Liu et.al. | 2505.12431 | null |
2025-05-18 | VoiceCloak: A Multi-Dimensional Defense Framework against Unauthorized Diffusion-based Voice Cloning | Qianyue Hu et.al. | 2505.12332 | null |
2025-05-18 | Beyond Single-Point Judgment: Distribution Alignment for LLM-as-a-Judge | Luyu Chen et.al. | 2505.12301 | null |
2025-05-18 | Can Large Multimodal Models Understand Agricultural Scenes? Benchmarking with AgroMind | Qingmei Li et.al. | 2505.12207 | null |
2025-05-18 | CTLformer: A Hybrid Denoising Model Combining Convolutional Layers and Self-Attention for Enhanced CT Image Reconstruction | Zhiting Zheng et.al. | 2505.12203 | null |
2025-05-17 | LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation | Jiarui Wang et.al. | 2505.12098 | link |
2025-05-17 | Accelerating Diffusion-based Super-Resolution with Dynamic Time-Spatial Sampling | Rui Qin et.al. | 2505.12048 | null |
2025-05-17 | Towards Comprehensive Argument Analysis in Education: Dataset, Tasks, and Method | Yupei Ren et.al. | 2505.12028 | null |
2025-05-17 | BINAQUAL: A Full-Reference Objective Localization Similarity Metric for Binaural Audio | Davoud Shariat Panah et.al. | 2505.11915 | link |
2025-05-16 | Semantically-Aware Game Image Quality Assessment | Kai Zhu et.al. | 2505.11724 | null |
2025-05-16 | Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration | Haipeng Fang et.al. | 2505.11707 | null |
2025-05-16 | No Gold Standard, No Problem: Reference-Free Evaluation of Taxonomies | Pascal Wullschleger et.al. | 2505.11470 | null |
2025-05-16 | Entropy-Driven Genetic Optimization for Deep-Feature-Guided Low-Light Image Enhancement | Nirjhor Datta et.al. | 2505.11246 | link |
2025-05-16 | DiCo: Revitalizing ConvNets for Scalable and Efficient Diffusion Modeling | Yuang Ai et.al. | 2505.11196 | link |
2025-05-16 | Controlling spatial correlation in k-space interpolation networks for MRI reconstruction: denoising versus apparent blurring | Istvan Homolya et.al. | 2505.11155 | null |
2025-05-16 | Pseudo-Label Quality Decoupling and Correction for Semi-Supervised Instance Segmentation | Jianghang Lin et.al. | 2505.11075 | null |
2025-05-16 | Shackled Dancing: A Bit-Locked Diffusion Algorithm for Lossless and Controllable Image Steganography | Tianshuo Zhang et.al. | 2505.10950 | null |
2025-05-16 | ToDMA: Large Model-Driven Token-Domain Multiple Access for Semantic Communications | Li Qiao et.al. | 2505.10946 | null |
2025-05-16 | Textured mesh Quality Assessment using Geometry and Color Field Similarity | Kaifa Yang et.al. | 2505.10824 | link |
2025-05-14 | Bias and Generalizability of Foundation Models across Datasets in Breast Mammography | Germani Elodie et.al. | 2505.10579 | null |
2025-05-15 | Learned Lightweight Smartphone ISP with Unpaired Data | Andrei Arhire et.al. | 2505.10420 | link |
2025-05-15 | Visual Fidelity Index for Generative Semantic Communications with Critical Information Embedding | Jianhao Huang et.al. | 2505.10405 | null |
2025-05-15 | Comparative Analysis of Richardson-Lucy Deconvolution and Data Unfolding with Mean Integrated Square Error Optimization | Nikolay D. Gagunashvili et.al. | 2505.10283 | null |
2025-05-15 | Ordered-subsets Multi-diffusion Model for Sparse-view CT Reconstruction | Pengfei Yu et.al. | 2505.09985 | null |
2025-05-14 | Conformal Bounds on Full-Reference Image Quality for Imaging Inverse Problems | Jeffrey Wen et.al. | 2505.09528 | link |
2025-05-14 | PDE: Gene Effect Inspired Parameter Dynamic Evolution for Low-light Image Enhancement | Tong Li et.al. | 2505.09196 | null |
2025-05-14 | TopoDiT-3D: Topology-Aware Diffusion Transformer with Bottleneck Structure for 3D Point Cloud Generation | Zechao Guan et.al. | 2505.09140 | link |
2025-05-13 | Learning Cocoercive Conservative Denoisers via Helmholtz Decomposition for Poisson Inverse Problems | Deliang Wei et.al. | 2505.08909 | null |
2025-05-12 | Towards SFW sampling for diffusion models via external conditioning | Camilo Carvajal Reyes et.al. | 2505.08817 | link |
2025-05-14 | WaveGuard: Robust Deepfake Detection and Source Tracing via Dual-Tree Complex Wavelet and Graph Neural Networks | Ziyuan He et.al. | 2505.08614 | link |
2025-05-14 | The RaspGrade Dataset: Towards Automatic Raspberry Ripeness Grading with Deep Learning | Mohamed Lamine Mekhalfi et.al. | 2505.08537 | null |
2025-05-13 | A Deep Learning-Driven Framework for Inhalation Injury Grading Using Bronchoscopy Images | Yifan Li et.al. | 2505.08517 | null |
2025-05-13 | The Evolutionary Map of the Universe: A new radio atlas for the southern hemisphere sky | A. M. Hopkins et.al. | 2505.08271 | null |
2025-05-13 | LLM-Based Detection of Tangled Code Changes for Higher-Quality Method-Level Bug Datasets | Md Nahidul Islam Opu et.al. | 2505.08263 | null |
2025-05-13 | Removing Watermarks with Partial Regeneration using Semantic Information | Krti Tallam et.al. | 2505.08234 | link |
2025-05-13 | Behind the Noise: Conformal Quantile Regression Reveals Emergent Representations | Petrus H. Zwart et.al. | 2505.08176 | null |
2025-05-12 | Asymptotically Efficient Data-adaptive Penalized Shrinkage Estimation with Application to Causal Inference | Herbert P. Susmann et.al. | 2505.08065 | link |
2025-05-12 | Multimodal Assessment of Classroom Discourse Quality: A Text-Centered Attention-Based Multi-Task Learning Approach | Ruikun Hou et.al. | 2505.07902 | null |
2025-05-12 | PtyRAD: A High-performance and Flexible Ptychographic Reconstruction Framework with Automatic Differentiation | Chia-Hao Lee et.al. | 2505.07814 | link |
2025-05-12 | Image Restoration via Integration of Optimal Control Techniques and the Hamilton-Jacobi-Bellman Equation | Dragos-Patru Covei et.al. | 2505.07699 | null |
2025-05-12 | A Case Study Investigating the Role of Generative AI in Quality Evaluations of Epics in Agile Software Development | Werner Geyer et.al. | 2505.07664 | null |
2025-05-12 | Addressing degeneracies in latent interpolation for diffusion models | Erik Landolsi et.al. | 2505.07481 | null |
2025-05-13 | Ophora: A Large-Scale Data-Driven Text-Guided Ophthalmic Surgical Video Generation Model | Wei Li et.al. | 2505.07449 | link |
2025-05-12 | Few-shot Semantic Encoding and Decoding for Video Surveillance | Baoping Cheng et.al. | 2505.07381 | null |
2025-05-12 | Synthetic Code Surgery: Repairing Bugs and Vulnerabilities with LLMs and Synthetic Data | David de-Fitero-Dominguez et.al. | 2505.07372 | null |
2025-05-12 | GAN-based synthetic FDG PET images from T1 brain MRI can serve to improve performance of deep unsupervised anomaly detection models | Daria Zotova et.al. | 2505.07364 | null |
2025-05-12 | Synthetic Similarity Search in Automotive Production | Christoph Huber et.al. | 2505.07256 | null |
2025-05-12 | Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding | Dianwen Ng et.al. | 2505.07235 | link |
2025-05-12 | Metrics that matter: Evaluating image quality metrics for medical image generation | Yash Deo et.al. | 2505.07175 | link |
2025-05-11 | Semantic-Guided Diffusion Model for Single-Step Image Super-Resolution | Zihang Liu et.al. | 2505.07071 | link |
2025-05-11 | DAPE: Dual-Stage Parameter-Efficient Fine-Tuning for Consistent Video Editing with Diffusion Models | Junhao Xia et.al. | 2505.07057 | null |
2025-05-11 | Replay-Based Continual Learning with Dual-Layered Distillation and a Streamlined U-Net for Efficient Text-to-Image Generation | Md. Naimur Asif Borno et.al. | 2505.06995 | null |
2025-05-10 | Tuning Butterworth filter's parameters in SPECT reconstructions via kernel-based Bayesian optimization with a no-reference image evaluation metric | Luca Pastrello et.al. | 2505.06692 | null |
2025-05-10 | MultiTaskVIF: Segmentation-oriented visible and infrared image fusion via multi-task learning | Zixian Zhao et.al. | 2505.06665 | null |
2025-05-10 | Two-Stage Random Alternation Framework for Zero-Shot Pansharpening | Haorui Chen et.al. | 2505.06576 | null |
2025-05-10 | Good Things Come in Pairs: Paired Autoencoders for Inverse Problems | Matthias Chung et.al. | 2505.06549 | null |
2025-05-10 | HDGlyph: A Hierarchical Disentangled Glyph-Based Framework for Long-Tail Text Rendering in Diffusion Models | Shuhan Zhuang et.al. | 2505.06543 | null |
2025-05-10 | Virtualized 3D Gaussians: Flexible Cluster-based Level-of-Detail System for Real-Time Rendering of Composed Scenes | Xijie Yang et.al. | 2505.06523 | null |
2025-05-09 | Towards Facial Image Compression with Consistency Preserving Diffusion Prior | Yimin Zhou et.al. | 2505.05870 | null |
2025-05-09 | PICD: Versatile Perceptual Image Compression with Diffusion Rendering | Tongda Xu et.al. | 2505.05853 | null |
2025-05-09 | Towards order of magnitude X-ray dose reduction in breast cancer imaging using phase contrast and deep denoising | Ashkan Pakzad et.al. | 2505.05812 | link |
2025-05-09 | Hybrid Learning: A Novel Combination of Self-Supervised and Supervised Learning for MRI Reconstruction without High-Quality Training Reference | Haoyang Pei et.al. | 2505.05703 | null |
2025-05-08 | A New k-Space Model for Non-Cartesian Fourier Imaging | Chin-Cheng Chan et.al. | 2505.05647 | null |
2025-05-08 | Score-based Self-supervised MRI Denoising | Jiachen Tu et.al. | 2505.05631 | null |
2025-05-08 | OXSeg: Multidimensional attention UNet-based lip segmentation using semi-supervised lip contours | Hanie Moghaddasi et.al. | 2505.05531 | null |
2025-05-08 | Flow-GRPO: Training Flow Matching Models via Online RL | Jie Liu et.al. | 2505.05470 | link |
2025-05-08 | A Study on Improvement of Image Quality in Quantum Polarized Microscopy using an Entangled-Photon Source | Mousume Samad et.al. | 2505.05457 | null |
2025-05-09 | LiTransProQA: an LLM-based Literary Translation evaluation metric with Professional Question Answering | Ran Zhang et.al. | 2505.05423 | link |
2025-05-08 | EAM: Enhancing Anything with Diffusion Transformers for Blind Super-Resolution | Haizhen Xie et.al. | 2505.05209 | null |
2025-05-08 | PaniCar: Securing the Perception of Advanced Driving Assistance Systems Against Emergency Vehicle Lighting | Elad Feldman et.al. | 2505.05183 | null |
2025-05-08 | MDE-Edit: Masked Dual-Editing for Multi-Object Image Editing via Diffusion Models | Hongyang Zhu et.al. | 2505.05101 | null |
2025-05-08 | MoRe-3DGSMR: Motion-resolved reconstruction framework for free-breathing pulmonary MRI based on 3D Gaussian representation | Tengya Peng et.al. | 2505.04959 | null |
2025-05-07 | Assessing Suburban Air Quality Constraints on Free Cooling in an Irish City | Paul D. O Sullivan et.al. | 2505.04746 | null |
2025-05-07 | Securing Immersive 360 Video Streams through Attribute-Based Selective Encryption | Mohammad Waquas Usmani et.al. | 2505.04466 | null |
2025-05-06 | dfreproject: A Python package for astronomical reprojection | Carter Lee Rhea et.al. | 2505.03932 | link |
2025-05-06 | DISARM++: Beyond scanner-free harmonization | Luca Caldera et.al. | 2505.03715 | link |
2025-05-06 | Towards Smart Point-and-Shoot Photography | Jiawan Li et.al. | 2505.03638 | null |
2025-05-07 | Breaking Annotation Barriers: Generalized Video Quality Assessment via Ranking-based Self-Supervision | Linhan Cao et.al. | 2505.03631 | link |
2025-05-07 | PAHA: Parts-Aware Audio-Driven Human Animation with Diffusion Model | Y. B. Wang et.al. | 2505.03603 | null |
2025-05-06 | Real-Time Person Image Synthesis Using a Flow Matching Model | Jiwoo Jeong et.al. | 2505.03562 | link |
2025-05-06 | MRI motion correction via efficient residual-guided denoising diffusion probabilistic models | Mojtaba Safari et.al. | 2505.03498 | null |
2025-05-06 | EOPose: Exemplar-based object reposing using Generalized Pose Correspondences | Sarthak Mehrotra et.al. | 2505.03394 | null |
2025-05-06 | DiffVQA: Video Quality Assessment Using Diffusion Feature Extractor | Wei-Ting Chen et.al. | 2505.03261 | null |
2025-05-06 | Is AI currently capable of identifying wild oysters? A comparison of human annotators against the AI model, ODYSSEE | Brendan Campbell et.al. | 2505.03108 | null |
2025-05-05 | NTIRE 2025 Challenge on UGC Video Enhancement: Methods and Results | Nikolay Safonov et.al. | 2505.03007 | link |
2025-05-05 | MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing | Zinan Guo et.al. | 2505.02823 | link |
2025-05-08 | Advances in Automated Fetal Brain MRI Segmentation and Biometry: Insights from the FeTA 2024 Challenge | Vladyslav Zalevskyi et.al. | 2505.02784 | null |
2025-05-05 | DeepSparse: A Foundation Model for Sparse-View CBCT Reconstruction | Yiqun Lin et.al. | 2505.02628 | null |
2025-05-05 | Deep learning of personalized priors from past MRI scans enables fast, quality-enhanced point-of-care MRI with low-cost systems | Tal Oved et.al. | 2505.02470 | null |
2025-05-05 | MSFNet-CPD: Multi-Scale Cross-Modal Fusion Network for Crop Pest Detection | Jiaqi Zhang et.al. | 2505.02441 | link |
2025-05-04 | Saliency-Guided Training for Fingerprint Presentation Attack Detection | Samuel Webster et.al. | 2505.02176 | null |
2025-05-04 | HiLLIE: Human-in-the-Loop Training for Low-Light Image Enhancement | Xiaorui Zhao et.al. | 2505.02134 | null |
2025-05-06 | Regression is all you need for medical image translation | Sebastian Rassmann et.al. | 2505.02048 | link |
2025-05-04 | Hybrid Image Resolution Quality Metric (HIRQM): A Comprehensive Perceptual Image Quality Assessment Framework | Vineesh Kumar Reddy Mondem et.al. | 2505.02001 | null |
2025-05-03 | GenSync: A Generalized Talking Head Framework for Audio-driven Multi-Subject Lip-Sync using 3D Gaussian Splatting | Anushka Agarwal et.al. | 2505.01928 | null |
2025-05-03 | ResiTok: A Resilient Tokenization-Enabled Framework for Ultra-Low-Rate and Robust Image Transmission | Zhenyu Liu et.al. | 2505.01870 | null |
2025-05-03 | Continuous Filtered Backprojection by Learnable Interpolation Network | Hui Lin et.al. | 2505.01768 | null |
2025-05-03 | Same evaluation, more tokens: On the effect of input length for machine translation evaluation using Large Language Models | Tobias Domhan et.al. | 2505.01761 | null |
2025-05-03 | Automated ARAT Scoring Using Multimodal Video Analysis, Multi-View Fusion, and Hierarchical Bayesian Models: A Clinician Study | Tamim Ahmed et.al. | 2505.01680 | null |
2025-05-02 | VIDSTAMP: A Temporally-Aware Watermark for Ownership and Integrity in Video Diffusion Models | Mohammadreza Teymoorianfard et.al. | 2505.01406 | link |
2025-05-02 | Potential Contrast: Properties, Equivalences, and Generalization to Multiple Classes | Wallace Peaslee et.al. | 2505.01388 | link |
2025-05-02 | FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis | Jiangtong Tan et.al. | 2505.01172 | link |
2025-05-02 | VSC: Visual Search Compositional Text-to-Image Diffusion Model | Do Huu Dat et.al. | 2505.01104 | null |
2025-05-02 | Diffuse Optical Ptychography | Mingwei He et.al. | 2505.01090 | null |
2025-05-02 | Enhancing Realism in Holographic Augmented Reality Displays through Occlusion Handling | Woongseob Han et.al. | 2505.00942 | null |
2025-05-01 | The Comparability of Model Fusion to Measured Data in Confuser Rejection | Conor Flynn et.al. | 2505.00836 | null |
2025-05-01 | GuideSR: Rethinking Guidance for One-Step High-Fidelity Diffusion-Based Super-Resolution | Aditya Arora et.al. | 2505.00687 | null |
2025-05-01 | Deep Learning Assisted Outer Volume Removal for Highly-Accelerated Real-Time Dynamic MRI | Merve Gülle et.al. | 2505.00643 | null |
2025-05-01 | KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution | Antoni Bigata et.al. | 2505.00497 | null |
2025-05-01 | Self-supervised surface-related multiple suppression with multidimensional convolution | Shijun Cheng et.al. | 2505.00419 | null |
2025-05-01 | Efficient Neural Video Representation with Temporally Coherent Modulation | Seungjun Shin et.al. | 2505.00335 | null |
2025-05-01 | Quaternion Wavelet-Conditioned Diffusion Models for Image Super-Resolution | Luigi Sigillo et.al. | 2505.00334 | null |
2025-05-01 | AI-Assisted Decision-Making for Clinical Assessment of Auto-Segmented Contour Quality | Biling Wang et.al. | 2505.00308 | null |
2025-04-30 | Efficient and robust 3D blind harmonization for large domain gaps | Hwihun Jeong et.al. | 2505.00133 | null |
2025-04-30 | From Aesthetics to Human Preferences: Comparative Perspectives of Evaluating Text-to-Music Systems | Huan Zhang et.al. | 2504.21815 | null |
2025-04-30 | Anatomical Similarity as a New Metric to Evaluate Brain Generative Models | Bahram Jafrasteh et.al. | 2504.21771 | null |
2025-04-30 | Visual Text Processing: A Comprehensive Review and Unified Evaluation | Yan Shu et.al. | 2504.21682 | link |
2025-04-30 | Diffusion-based Adversarial Identity Manipulation for Facial Privacy Protection | Liqin Wang et.al. | 2504.21646 | null |
2025-04-30 | RDF-Based Structured Quality Assessment Representation of Multilingual LLM Evaluations | Jonas Gwozdz et.al. | 2504.21605 | null |
2025-04-30 | Latent Feature-Guided Conditional Diffusion for High-Fidelity Generative Image Semantic Communication | Zehao Chen et.al. | 2504.21577 | null |
2025-04-30 | The First Theoretical Approximation Guarantees for the Non-Dominated Sorting Genetic Algorithm III (NSGA-III) | Renzhong Deng et.al. | 2504.21552 | null |
2025-04-30 | Impairments are Clustered in Latents of Deep Neural Network-based Speech Quality Models | Fredrik Cumlin et.al. | 2504.21528 | link |
2025-04-30 | Simple Visual Artifact Detection in Sora-Generated Videos | Misora Sugiyama et.al. | 2504.21334 | null |
2025-04-30 | Redundancy Analysis and Mitigation for Machine Learning-Based Process Monitoring of Additive Manufacturing | Jiarui Xie et.al. | 2504.21317 | null |
2025-04-30 | AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images | Yunhao Li et.al. | 2504.21308 | null |
2025-04-30 | High-Fidelity Single-Pixel Imaging at Ultra-Low Sampling Ratios via Physically Enhanced Laguerre Gaussian Encoding | JunYi Xiong et.al. | 2504.21290 | null |
2025-04-29 | GauSS-MI: Gaussian Splatting Shannon Mutual Information for Active 3D Reconstruction | Yuhan Xie et.al. | 2504.21067 | link |
2025-04-29 | YoChameleon: Personalized Vision and Language Generation | Thao Nguyen et.al. | 2504.20998 | null |
2025-04-29 | All-dielectric metasurface polarization scrambler for imaging applications | Edith Hartmann et.al. | 2504.20727 | null |
2025-04-29 | Histogram-Probabilistic Multi-Hypothesis Tracking with Integrated Target Existence | Lukas Herrmann et.al. | 2504.20526 | null |
2025-04-29 | LMM4Gen3DHF: Benchmarking and Evaluating Multimodal 3D Human Face Generation with LMMs | Woo Yi Yang et.al. | 2504.20466 | null |
2025-04-29 | APG-MOS: Auditory Perception Guided-MOS Predictor for Synthetic Speech | Zhicheng Lian et.al. | 2504.20447 | null |
2025-04-28 | CompleteMe: Reference-based Human Image Completion | Yu-Ju Tsai et.al. | 2504.20042 | null |
2025-04-28 | Enhancing Quality for VVC Compressed Videos with Omniscient Quality Enhancement Model | Xiem HoangVan et.al. | 2504.19935 | null |
2025-04-28 | Accelerated 3D-3D rigid registration of echocardiographic images obtained from apical window using particle filter | Thanuja Uruththirakodeeswaran et.al. | 2504.19930 | null |
2025-04-28 | Prompt Guiding Multi-Scale Adaptive Sparse Representation-driven Network for Low-Dose CT MAR | Baoshun Shi et.al. | 2504.19687 | null |
2025-04-27 | HumMorph: Generalized Dynamic Human Neural Fields from Few Views | Jakub Zadrożny et.al. | 2504.19390 | null |
2025-04-27 | Machine Learning-Based Modeling of the Anode Heel Effect in X-ray Beam Monte Carlo Simulations | Hussein Harb et.al. | 2504.19155 | null |
2025-04-26 | Calibrating Translation Decoding with Quality Estimation on LLMs | Di Wu et.al. | 2504.19044 | link |
2025-04-26 | REED-VAE: RE-Encode Decode Training for Iterative Image Editing with Diffusion Models | Gal Almog et.al. | 2504.18989 | link |
2025-04-26 | Audio-Driven Talking Face Video Generation with Joint Uncertainty Learning | Yifan Xie et.al. | 2504.18810 | null |
2025-04-25 | Augmenting Perceptual Super-Resolution via Image Quality Predictors | Fengjia Zhang et.al. | 2504.18524 | null |
2025-04-25 | NoiseController: Towards Consistent Multi-view Video Generation via Noise Decomposition and Collaboration | Haotian Dong et.al. | 2504.18448 | null |
2025-04-25 | Disentangle Identity, Cooperate Emotion: Correlation-Aware Emotional Talking Portrait Generation | Weipeng Tan et.al. | 2504.18087 | null |
2025-04-28 | From Cluster to Desktop: A Cache-Accelerated INR framework for Interactive Visualization of Tera-Scale Data | Daniel Zavorotny et.al. | 2504.18001 | null |
2025-04-23 | Learning Underwater Active Perception in Simulation | Alexandre Cardaillac et.al. | 2504.17817 | null |
2025-04-24 | Self-Supervised Noise Adaptive MRI Denoising via Repetition to Repetition (Rep2Rep) Learning | Nikola Janjušević et.al. | 2504.17698 | null |
2025-04-24 | Tamper-evident Image using JPEG Fixed Points | Zhaofeng Si et.al. | 2504.17594 | null |
2025-04-24 | ESDiff: Encoding Strategy-inspired Diffusion Model with Few-shot Learning for Color Image Inpainting | Junyan Zhang et.al. | 2504.17524 | null |
2025-04-24 | Inverse-Designed Metasurfaces for Wavefront Restoration in Under-Display Camera Systems | Jaegang Jo et.al. | 2504.17368 | null |
2025-04-24 | Scene Perceived Image Perceptual Score (SPIPS): combining global and local perception for image quality assessment | Zhiqiang Lao et.al. | 2504.17234 | null |
2025-04-23 | Distilling semantically aware orders for autoregressive image generation | Rishav Pramanik et.al. | 2504.17069 | null |
2025-04-23 | Diffusion Probabilistic Models for Compressive SAR Imaging | Odysseas Pappas et.al. | 2504.17053 | null |
2025-04-23 | Dense Air Pollution Estimation from Sparse in-situ Measurements and Satellite Data | Ruben Gonzalez Avilés et.al. | 2504.17039 | null |
2025-04-22 | TVC: Tokenized Video Compression with Ultra-Low Bitrate | Lebin Zhou et.al. | 2504.16953 | null |
2025-04-23 | Beyond Anonymization: Object Scrubbing for Privacy-Preserving 2D and 3D Vision Tasks | Murat Bilgehan Ertan et.al. | 2504.16557 | null |
2025-04-23 | ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance | Ying Li et.al. | 2504.16464 | null |
2025-04-23 | VideoMark: A Distortion-Free Robust Watermarking Framework for Video Diffusion Models | Xuming Hu et.al. | 2504.16359 | null |
2025-04-23 | Property-Preserving Hashing for | Hassan Asghar et.al. | 2504.16355 | null |
2025-04-22 | Analytic Fourier ptychotomography for volumetric refractive index imaging | Zhenyu Dong et.al. | 2504.16247 | link |
2025-04-22 | Survey of Video Diffusion Models: Foundations, Implementations, and Applications | Yimu Wang et.al. | 2504.16081 | link |
2025-04-22 | From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning | Le Zhuo et.al. | 2504.16080 | null |
2025-04-22 | Boosting Generative Image Modeling via Joint Image-Feature Synthesis | Theodoros Kouzelis et.al. | 2504.16064 | null |
2025-04-22 | MVQA: Mamba with Unified Sampling for Efficient Video Quality Assessment | Yachun Mi et.al. | 2504.16003 | null |
2025-04-22 | DSDNet: Raw Domain Demoiréing via Dual Color-Space Synergy | Qirui Yang et.al. | 2504.15756 | null |
2025-04-22 | You Sense Only Once Beneath: Ultra-Light Real-Time Underwater Object Detection | Jun Dong et.al. | 2504.15694 | null |
2025-04-23 | Iris: A Next Generation Digital Pathology Rendering Engine | Ryan Erik Landvater et.al. | 2504.15437 | link |
2025-04-21 | Plug-and-Play Versatile Compressed Video Enhancement | Huimin Zeng et.al. | 2504.15380 | null |
2025-04-20 | Enhancing DR Classification with Swin Transformer and Shifted Window Attention | Meher Boulaabi et.al. | 2504.15317 | null |
2025-04-21 | StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians | Cailin Zhuang et.al. | 2504.15281 | null |
2025-04-21 | Tiger200K: Manually Curated High Visual Quality Video Dataset from UGC Platform | Xianpan Zhou et.al. | 2504.15182 | null |
2025-04-21 | Acquire and then Adapt: Squeezing out Text-to-Image Model for Image Restoration | Junyuan Deng et.al. | 2504.15159 | null |
2025-04-21 | Structure-guided Diffusion Transformer for Low-Light Image Enhancement | Xiangchen Yin et.al. | 2504.15054 | null |
2025-04-21 | NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: KwaiSR Dataset and Study | Xin Li et.al. | 2504.15003 | null |
2025-04-21 | K-DRIFT Preparation: Experimental Verification of an Observation Strategy for Accurate Dark-Sky Flats | Woowon Byun et.al. | 2504.14914 | null |
2025-04-20 | Translation Analytics for Freelancers: I. Introduction, Data Preparation, Baseline Evaluations | Yuri Balashov et.al. | 2504.14619 | null |
2025-04-20 | NTIRE 2025 Challenge on Real-World Face Restoration: Methods and Results | Zheng Chen et.al. | 2504.14600 | link |
2025-04-20 | SUDO: Enhancing Text-to-Image Diffusion Models with Self-Supervised Direct Preference Optimization | Liang Peng et.al. | 2504.14534 | link |
2025-04-20 | Pulmonary electrical impedance tomography based on deep recurrent neural networks | Zhenzhong Song et.al. | 2504.14521 | null |
2025-04-19 | Single Document Image Highlight Removal via A Large-Scale Real-World Dataset and A Location-Aware Network | Lu Pan et.al. | 2504.14238 | null |
2025-04-19 | Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models | Xinlin Zhuang et.al. | 2504.14194 | null |
2025-04-18 | Point-Driven Interactive Text and Image Layer Editing Using Diffusion Models | Zhenyu Yu et.al. | 2504.14108 | null |
2025-04-18 | Towards Scale-Aware Low-Light Enhancement via Structure-Guided Transformer Design | Wei Dong et.al. | 2504.14075 | link |
2025-04-18 | Fashion-RAG: Multimodal Fashion Image Editing via Retrieval-Augmented Generation | Fulvio Sanguigni et.al. | 2504.14011 | null |
2025-04-18 | Entropy Rectifying Guidance for Diffusion and Flow Models | Tariq Berrada Ifriqi et.al. | 2504.13987 | null |
2025-04-18 | SupResDiffGAN a new approach for the Super-Resolution task | Dawid Kopeć et.al. | 2504.13622 | null |
2025-04-18 | Entropic Time Schedulers for Generative Diffusion Models | Dejan Stancevic et.al. | 2504.13612 | null |
2025-04-18 | U-Shape Mamba: State Space Model for faster diffusion | Alex Ergasti et.al. | 2504.13499 | link |
2025-04-17 | Auto-FEDUS: Autoregressive Generative Modeling of Doppler Ultrasound Signals from Fetal Electrocardiograms | Alireza Rafiei et.al. | 2504.13233 | null |
2025-04-17 | NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results | Xin Li et.al. | 2504.13131 | link |
2025-04-18 | SkyReels-V2: Infinite-length Film Generative Model | Guibin Chen et.al. | 2504.13074 | link |
2025-04-17 | Imaging for All-Day Wearable Smart Glasses | Michael Goesele et.al. | 2504.13060 | null |
2025-04-17 | TTRD3: Texture Transfer Residual Denoising Dual Diffusion Model for Remote Sensing Image Super-Resolution | Yide Liu et.al. | 2504.13026 | link |
2025-04-17 | Efficient Chebyshev Reconstruction for the Anisotropic Equilibrium Model in Magnetic Particle Imaging | Christine Droigk et.al. | 2504.12981 | null |
2025-04-17 | Sparks of Science: Hypothesis Generation Using Structured Paper Data | Charles O'Neill et.al. | 2504.12976 | null |
2025-04-17 | MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection | Long Qian et.al. | 2504.12970 | null |
2025-04-17 | Saliency-Aware Diffusion Reconstruction for Effective Invisible Watermark Removal | Inzamamul Alam et.al. | 2504.12809 | link |
2025-04-17 | ARAP-GS: Drag-driven As-Rigid-As-Possible 3D Gaussian Splatting Editing with Diffusion Prior | Xiao Han et.al. | 2504.12788 | null |
2025-04-17 | Mask Image Watermarking | Runyi Hu et.al. | 2504.12739 | link |
2025-04-17 | SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding | Qianqian Sun et.al. | 2504.12704 | null |
2025-04-17 | Autonomous Drone for Dynamic Smoke Plume Tracking | Srijan Kumar Pal et.al. | 2504.12664 | null |
2025-04-17 | Packing Input Frame Context in Next-Frame Prediction Models for Video Generation | Lvmin Zhang et.al. | 2504.12626 | link |
2025-04-17 | AdaQual-Diff: Diffusion-Based Image Restoration via Adaptive Quality Prompting | Xin Su et.al. | 2504.12605 | null |
2025-04-16 | InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework | Jiale Tao et.al. | 2504.12395 | link |
2025-04-16 | SIDME: Self-supervised Image Demoiréing via Masked Encoder-Decoder Reconstruction | Xia Wang et.al. | 2504.12245 | null |
2025-04-16 | Coding-Prior Guided Diffusion Network for Video Deblurring | Yike Liu et.al. | 2504.12222 | null |
2025-04-16 | Anti-Aesthetics: Protecting Facial Privacy against Customized Text-to-Image Synthesis | Songping Wang et.al. | 2504.12129 | null |
2025-04-17 | Understanding Attention Mechanism in Video Diffusion Models | Bingyan Liu et.al. | 2504.12027 | null |
2025-04-16 | Instruction-augmented Multimodal Alignment for Image-Text and Element Matching | Xinli Yue et.al. | 2504.12018 | null |
2025-04-16 | PCDiff: Proactive Control for Ownership Protection in Diffusion Models with Watermark Compatibility | Keke Gai et.al. | 2504.11774 | null |
2025-04-17 | DVLTA-VQA: Decoupled Vision-Language Modeling with Text-Guided Adaptation for Blind Video Quality Assessment | Li Yu et.al. | 2504.11733 | null |
2025-04-16 | Measuring Global Migration Flows using Online Data | Guanghua Chi et.al. | 2504.11691 | null |
2025-04-15 | AskQE: Question Answering as Automatic Evaluation for Machine Translation | Dayeon Ki et.al. | 2504.11582 | null |
2025-04-15 | A Comparative Evaluation of CT Global Noise Calculation Methods for Clinical Image Quality Assessment | Charles M Weaver et.al. | 2504.11578 | null |
2025-04-15 | ADT: Tuning Diffusion Models with Adversarial Supervision | Dazhong Shen et.al. | 2504.11423 | null |
2025-04-15 | Ring Artifacts Correction Based on Global-Local Features Interaction Guidance in the Projection Domain | Yunze Liu et.al. | 2504.11375 | null |
2025-04-16 | Seedream 3.0 Technical Report | Yu Gao et.al. | 2504.11346 | null |
2025-04-15 | Distillation-Supervised Convolutional Low-Rank Adaptation for Efficient Image Super-Resolution | Xinning Chai et.al. | 2504.11271 | link |
2025-04-15 | SAR-to-RGB Translation with Latent Diffusion for Earth Observation | Kaan Aydin et.al. | 2504.11154 | null |
2025-04-15 | Taming Consistency Distillation for Accelerated Human Image Animation | Xiang Wang et.al. | 2504.11143 | null |
2025-04-15 | Document Quality Scoring for Web Crawling | Francesca Pezzuti et.al. | 2504.11011 | link |
2025-04-15 | AgentPolyp: Accurate Polyp Segmentation via Image Enhancement Agent | Pu Wang et.al. | 2504.10978 | null |
2025-04-15 | Bringing together invertible UNets with invertible attention modules for memory-efficient diffusion models | Karan Jain et.al. | 2504.10883 | null |
2025-04-15 | LayoutCoT: Unleashing the Deep Reasoning Potential of Large Language Models for Layout Generation | Hengyu Shi et.al. | 2504.10829 | null |
2025-04-15 | Efficient and Robust Remote Sensing Image Denoising Using Randomized Approximation of Geodesics' Gramian on the Manifold Underlying the Patch Space | Kelum Gajamannage et.al. | 2504.10820 | null |
2025-04-15 | The Art of Audience Engagement: LLM-Based Thin-Slicing of Scientific Talks | Ralf Schmälzle et.al. | 2504.10768 | null |
2025-04-15 | Trade-offs in Privacy-Preserving Eye Tracking through Iris Obfuscation: A Benchmarking Study | Mengdi Wang et.al. | 2504.10267 | link |
2025-04-14 | Aligning Anime Video Generation with Human Feedback | Bingwen Zhu et.al. | 2504.10044 | null |
2025-04-14 | Progressive Transfer Learning for Multi-Pass Fundus Image Restoration | Uyen Phan et.al. | 2504.10025 | null |
2025-04-14 | A Theory of Universal Rate-Distortion-Classification Representations for Lossy Compression | Nam Nguyen et.al. | 2504.09932 | null |
2025-04-14 | EquiVDM: Equivariant Video Diffusion Models with Temporally Consistent Noise | Chao Liu et.al. | 2504.09789 | null |
2025-04-13 | SPICE: A Synergistic, Precise, Iterative, and Customizable Image Editing Workflow | Kenan Tang et.al. | 2504.09697 | link |
2025-04-13 | KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation | Xingrui Wang et.al. | 2504.09656 | null |
2025-04-13 | Trajectory-guided Motion Perception for Facial Expression Quality Assessment in Neurological Disorders | Shuchao Duan et.al. | 2504.09530 | link |
2025-04-13 | A Secure Communication Protocol for Remote Keyless Entry System with Adaptive Adjustment of Transmission Parameters | Jingjing Guo et.al. | 2504.09527 | null |
2025-04-13 | Enhancing Wide-Angle Image Using Narrow-Angle View of the Same Scene | Hussain Md. Safwan et.al. | 2504.09455 | link |
2025-04-12 | Towards Explainable Partial-AIGC Image Quality Assessment | Jiaying Qian et.al. | 2504.09291 | null |
2025-04-12 | FVQ: A Large-Scale Dataset and A LMM-based Method for Face Video Quality Assessment | Sijing Wu et.al. | 2504.09255 | link |
2025-04-12 | Universal Rate-Distortion-Classification Representations for Lossy Compression | Nam Nguyen et.al. | 2504.09025 | null |
2025-04-11 | End-to-End Demonstration of Quantum Generative Adversarial Networks for Steel Microstructure Image Augmentation on a Trapped-Ion Quantum Computer | Samwel Sekwao et.al. | 2504.08728 | null |
2025-04-11 | Generating Fine Details of Entity Interactions | Xinyi Gu et.al. | 2504.08714 | null |
2025-04-11 | Quality evaluation of Tabby coding assistant using real source code snippets | Marta Borek et.al. | 2504.08650 | link |
2025-04-11 | Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization | Jialu Li et.al. | 2504.08641 | null |
2025-04-11 | Shadow Erosion and Nighttime Adaptability for Camera-Based Automated Driving Applications | Mohamed Sabry et.al. | 2504.08551 | null |
2025-04-11 | Quality Diversity for Variational Quantum Circuit Optimization | Maximilian Zorn et.al. | 2504.08459 | link |
2025-04-11 | A Knowledge-guided Adversarial Defense for Resisting Malicious Visual Manipulation | Dawei Zhou et.al. | 2504.08411 | null |
2025-04-11 | MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft | Junliang Guo et.al. | 2504.08388 | null |
2025-04-11 | LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs | Jiarui Wang et.al. | 2504.08358 | link |
2025-04-11 | All-in-Memory Stochastic Computing using ReRAM | João Paulo C. de Lima et.al. | 2504.08340 | null |
2025-04-10 | Gen3DEval: Using vLLMs for Automatic Evaluation of Generated 3D Objects | Shalini Maiti et.al. | 2504.08125 | null |
2025-04-10 | SRVP: Strong Recollection Video Prediction Model Using Attention-Based Spatiotemporal Correlation Fusion | Yuseon Kim et.al. | 2504.08012 | link |
2025-04-10 | PixelFlow: Pixel-Space Generative Models with Flow | Shoufa Chen et.al. | 2504.07963 | link |
2025-04-10 | TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs | Zijian Zhang et.al. | 2504.07556 | null |
2025-04-10 | AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation | Tuhin Chakrabarty et.al. | 2504.07532 | link |
2025-04-10 | AgentAda: Skill-Adaptive Data Analytics for Tailored Insight Discovery | Amirhossein Abaskohi et.al. | 2504.07421 | link |
2025-04-09 | Dependency Update Adoption Patterns in the Maven Software Ecosystem | Baltasar Berretta et.al. | 2504.07310 | null |
2025-04-09 | MoEDiff-SR: Mixture of Experts-Guided Diffusion Model for Region-Adaptive MRI Super-Resolution | Zhe Wang et.al. | 2504.07308 | link |
2025-04-09 | Q-Agent: Quality-Driven Chain-of-Thought Image Restoration Agent through Robust Multimodal Large Language Model | Yingjie Zhou et.al. | 2504.07148 | null |
2025-04-09 | End2end-ALARA: Approaching the ALARA Law in CT Imaging with End-to-end Learning | Xi Tao et.al. | 2504.06777 | null |
2025-04-09 | RAGME: Retrieval Augmented Video Generation for Enhanced Motion Realism | Elia Peruzzo et.al. | 2504.06672 | null |
2025-04-10 | Subjective Visual Quality Assessment for High-Fidelity Learning-Based Image Compression | Mohsen Jenadeleh et.al. | 2504.06301 | link |
2025-04-08 | HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance | Jiazi Bu et.al. | 2504.06232 | null |
2025-04-08 | CamContextI2V: Context-aware Controllable Video Generation | Luis Denninger et.al. | 2504.06022 | link |
2025-04-08 | ViralQC: A Tool for Assessing Completeness and Contamination of Predicted Viral Contigs | Cheng Peng et.al. | 2504.05790 | link |
2025-04-08 | A Lightweight Multi-Module Fusion Approach for Korean Character Recognition | Inho Jake Park et.al. | 2504.05770 | null |
2025-04-08 | STRIVE: A Think & Improve Approach with Iterative Refinement for Enhancing Question Quality Estimation | Aniket Deroy et.al. | 2504.05693 | null |
2025-04-07 | Improved Stochastic Texture Filtering Through Sample Reuse | Bartlomiej Wronski et.al. | 2504.05562 | null |
2025-04-07 | Towards Efficient Real-Time Video Motion Transfer via Generative Time Series Modeling | Tasmiah Haque et.al. | 2504.05537 | null |
2025-04-07 | L3GS: Layered 3D Gaussian Splats for Efficient 3D Scene Delivery | Yi-Zhen Tsai et.al. | 2504.05517 | link |
2025-04-07 | Let it Snow! Animating Static Gaussian Scenes With Dynamic Weather Effects | Gal Fiebelman et.al. | 2504.05296 | null |
2025-04-07 | Balancing Task-invariant Interaction and Task-specific Adaptation for Unified Image Fusion | Xingyu Hu et.al. | 2504.05164 | null |
2025-04-07 | Content-Distortion High-Order Interaction for Blind Image Quality Assessment | Shuai Liu et.al. | 2504.05076 | null |
2025-04-07 | Low-Rate Semantic Communication with Codebook-based Conditional Generative Models | Kailang Ye et.al. | 2504.04977 | null |
2025-04-07 | Video-Bench: Human-Aligned Video Generation Benchmark | Hui Han et.al. | 2504.04907 | null |
2025-04-07 | Bidirectional Hierarchical Protein Multi-Modal Representation Learning | Xuefeng Liu et.al. | 2504.04770 | null |
2025-04-06 | BrainMRDiff: A Diffusion Model for Anatomically Consistent Brain MRI Synthesis | Moinak Bhattacharya et.al. | 2504.04532 | null |
2025-04-06 | FluentLip: A Phonemes-Based Two-stage Approach for Audio-Driven Lip Synthesis with Optical Flow Consistency | Shiyan Liu et.al. | 2504.04427 | null |
2025-04-05 | Multi-identity Human Image Animation with Structural Video Diffusion | Zhenzhi Wang et.al. | 2504.04126 | null |
2025-04-05 | Mapping at First Sense: A Lightweight Neural Network-Based Indoor Structures Prediction Method for Robot Autonomous Exploration | Haojia Gao et.al. | 2504.04061 | null |
2025-04-05 | OpenCodeInstruct: A Large-scale Instruction Tuning Dataset for Code LLMs | Wasi Uddin Ahmad et.al. | 2504.04030 | null |
2025-04-05 | DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion | Maksim Siniukov et.al. | 2504.04010 | null |
2025-04-04 | From Keypoints to Realism: A Realistic and Accurate Virtual Try-on Network from 2D Images | Maliheh Toozandehjani et.al. | 2504.03807 | null |
2025-04-04 | Quantifying the uncertainty of model-based synthetic image quality metrics | Ciaran Bench et.al. | 2504.03623 | null |
2025-04-04 | Multimodal Diffusion Bridge with Attention-Based SAR Fusion for Satellite Image Cloud Removal | Yuyang Hu et.al. | 2504.03607 | null |
2025-04-04 | BUFF: Bayesian Uncertainty Guided Diffusion Probabilistic Model for Single Image Super-Resolution | Zihao He et.al. | 2504.03490 | null |
2025-04-04 | NeRFlex: Resource-aware Real-time High-quality Rendering of Complex Scenes on Mobile Devices | Zhe Wang et.al. | 2504.03415 | null |
2025-04-04 | Point Cloud Objective Quality: Benchmarking Features and Quality Evaluation | Joao Prazeres et.al. | 2504.03381 | null |
2025-04-04 | Space-Time Encoded Modulation for High-Fidelity Diffuse Optical Imaging | Ben Wiesel et.al. | 2504.03246 | null |
2025-04-04 | Three Forensic Cues for JPEG AI Images | Sandra Bergmann et.al. | 2504.03191 | null |
2025-04-04 | FontGuard: A Robust Font Watermarking Approach Leveraging Deep Font Knowledge | Kahim Wong et.al. | 2504.03128 | link |
2025-04-03 | Compressing 3D Gaussian Splatting by Noise-Substituted Vector Quantization | Haishan Wang et.al. | 2504.03059 | link |
2025-04-03 | Fuzzy Implicative Rules: A Unified Approach | Raquel Fernandez-Peralta et.al. | 2504.03000 | null |
2025-04-03 | Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization | Kangle Deng et.al. | 2504.02817 | null |
2025-04-03 | Development of Automated Data Quality Assessment and Evaluation Indices by Analytical Experience | Yuka Haruki et.al. | 2504.02663 | null |
2025-04-03 | Charm: The Missing Piece in ViT fine-tuning for Image Aesthetic Assessment | Fatemeh Behrad et.al. | 2504.02522 | link |
2025-04-03 | MultiNeRF: Multiple Watermark Embedding for Neural Radiance Fields | Yash Kulthe et.al. | 2504.02517 | null |
2025-04-03 | Translation of Fetal Brain Ultrasound Images into Pseudo-MRI Images using Artificial Intelligence | Naomi Silverstein et.al. | 2504.02408 | null |
2025-04-03 | SemiISP/SemiIE: Semi-Supervised Image Signal Processor and Image Enhancement Leveraging One-to-Many Mapping sRGB-to-RAW | Masakazu Yoshimura et.al. | 2504.02345 | null |
2025-04-03 | ConsDreamer: Advancing Multi-View Consistency for Zero-Shot Text-to-3D Generation | Yuan Zhou et.al. | 2504.02316 | link |
2025-04-03 | Image Coding for Machines via Feature-Preserving Rate-Distortion Optimization | Samuel Fernández-Menduiña et.al. | 2504.02216 | null |
2025-04-02 | Foreground Focus: Enhancing Coherence and Fidelity in Camouflaged Image Generation | Pei-Chi Chen et.al. | 2504.02180 | null |
2025-04-02 | BioAtt: Anatomical Prior Driven Low-Dose CT Denoising | Namhun Kim et.al. | 2504.01662 | null |
2025-04-02 | Q-Adapt: Adapting LMM for Visual Quality Assessment with Progressive Instruction Tuning | Yiting Lu et.al. | 2504.01655 | link |
2025-04-02 | RealityAvatar: Towards Realistic Loose Clothing Modeling in Animatable 3D Gaussian Avatars | Yahui Li et.al. | 2504.01559 | null |
2025-04-02 | Multi-Marker Similarity enables reduced-reference and interpretable image quality assessment in optical microscopy | Elena Corbetta et.al. | 2504.01537 | null |
2025-04-02 | Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment | Ziteng Cui et.al. | 2504.01503 | link |
2025-04-02 | FlowMotion: Target-Predictive Flow Matching for Realistic Text-Driven Human Motion Generation | Manolo Canales Cuba et.al. | 2504.01338 | null |
2025-04-01 | FUSION: Frequency-guided Underwater Spatial Image recOnstructioN | Jaskaran Singh Walia et.al. | 2504.01243 | null |
2025-04-01 | A Conformal Risk Control Framework for Granular Word Assessment and Uncertainty Calibration of CLIPScore Quality Estimates | Gonçalo Gomes et.al. | 2504.01225 | null |
2025-04-01 | Epistemic Alignment: A Mediating Framework for User-LLM Knowledge Delivery | Nicholas Clark et.al. | 2504.01205 | null |
2025-04-01 | Video Quality Assessment for Resolution Cross-Over in Live Sports | Jingwen Zhu et.al. | 2504.01190 | null |
2025-04-01 | ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations | Yubo Wang et.al. | 2504.00824 | null |
2025-04-01 | The GLASS-JWST Early Release Science Programme: The NIRISS Spectroscopic Catalogue | Peter J. Watson et.al. | 2504.00823 | link |
2025-04-01 | DropGaussian: Structural Regularization for Sparse-view Gaussian Splatting | Hyunwoo Park et.al. | 2504.00773 | null |
2025-04-01 | Enhancing Fundus Image-based Glaucoma Screening via Dynamic Global-Local Feature Integration | Yuzhuo Zhou et.al. | 2504.00431 | null |
2025-03-31 | Bayesian Imaging of Interferometric Data from Polarized Electromagnetic Signals | Philipp Arras et.al. | 2504.00227 | null |
2025-03-31 | Any2Caption:Interpreting Any Condition to Caption for Controllable Video Generation | Shengqiong Wu et.al. | 2503.24379 | null |
2025-03-31 | ERUPT: Efficient Rendering with Unposed Patch Transformer | Maxim V. Shugaev et.al. | 2503.24374 | null |
2025-03-31 | StochasticSplats: Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting | Shakiba Kheradmand et.al. | 2503.24366 | null |
2025-03-31 | DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting | Seungjun Lee et.al. | 2503.24210 | null |
2025-03-31 | FineCausal: A Causal-Based Framework for Interpretable Fine-Grained Action Quality Assessment | Ruisheng Han et.al. | 2503.23911 | link |
2025-03-31 | Training-Free Text-Guided Image Editing with Visual Autoregressive Model | Yufei Wang et.al. | 2503.23897 | link |
2025-04-01 | Learned Image Compression and Restoration for Digital Pathology | SeonYeong Lee et.al. | 2503.23862 | link |
2025-03-30 | What Makes an Evaluation Useful? Common Pitfalls and Best Practices | Gil Gekker et.al. | 2503.23424 | null |
2025-03-30 | Improving underwater semantic segmentation with underwater image quality attention and muti-scale aggregation attention | Xin Zuo et.al. | 2503.23422 | link |
2025-03-30 | Visual Acuity Consistent Foveated Rendering towards Retinal Resolution | Zhi Zhang et.al. | 2503.23410 | null |
2025-03-30 | Map Feature Perception Metric for Map Generation Quality Assessment and Loss Optimization | Chenxing Sun et.al. | 2503.23370 | null |
2025-03-29 | NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representations | Zhenyu Tang et.al. | 2503.23162 | null |
2025-03-29 | STSA: Spatial-Temporal Semantic Alignment for Visual Dubbing | Zijun Ding et.al. | 2503.23039 | link |
2025-03-28 | Concept and Demonstration of a Low-cost Compact Electron Microscope Enabled by a Photothermionic Carbon Nanotube Cathode | Casimir Kuzyk et.al. | 2503.22910 | null |
2025-03-28 | Learning to Reason for Long-Form Story Generation | Alexander Gurung et.al. | 2503.22828 | link |
2025-03-28 | Q-Insight: Understanding Image Quality via Visual Reinforcement Learning | Weiqi Li et.al. | 2503.22679 | link |
2025-03-28 | Evaluation of Machine-generated Biomedical Images via A Tally-based Similarity Measure | Frank J. Brooks et.al. | 2503.22658 | null |
2025-03-28 | RELD: Regularization by Latent Diffusion Models for Image Restoration | Pasquale Cascarano et.al. | 2503.22563 | null |
2025-03-28 | Data Quality Matters: Quantifying Image Quality Impact on Machine Learning Performance | Christian Steinhauser et.al. | 2503.22375 | null |
2025-03-28 | Imperceptible but Forgeable: Practical Invisible Watermark Forgery via Diffusion Models | Ziping Dong et.al. | 2503.22330 | null |
2025-03-28 | Mono2Stereo: A Benchmark and Empirical Study for Stereo Conversion | Songsong Yu et.al. | 2503.22262 | null |
2025-03-27 | Multispectral Demosaicing via Dual Cameras | SaiKiran Tedla et.al. | 2503.22026 | null |
2025-03-27 | Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video | David Yifan Yao et.al. | 2503.21761 | link |
2025-03-27 | Lumina-Image 2.0: A Unified and Efficient Image Generative Framework | Qi Qin et.al. | 2503.21758 | link |
2025-03-27 | 3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models | Yuhan Zhang et.al. | 2503.21745 | null |
2025-03-27 | Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance | Jaywon Koo et.al. | 2503.21721 | null |
2025-03-27 | Audio-driven Gesture Generation via Deviation Feature in the Latent Space | Jiahui Chen et.al. | 2503.21616 | null |
2025-03-27 | In vivo dynamic optical coherence tomography of human skin with hardware- and software-based motion correction | Yu Guo et.al. | 2503.21384 | link |
2025-03-27 | Zero-Shot Visual Concept Blending Without Text Guidance | Hiroya Makino et.al. | 2503.21277 | link |
2025-03-27 | Reducing CT Metal Artifacts by Learning Latent Space Alignment with Gemstone Spectral Imaging Data | Wencheng Han et.al. | 2503.21259 | null |
2025-03-26 | Generalized Ray Tracing with Basis functions for Tomographic Projections | Youssef Haouchat et.al. | 2503.20907 | null |
2025-03-26 | Debiasing Kernel-Based Generative Models | Tian Qin et.al. | 2503.20825 | null |
2025-03-27 | Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy | Yinan Sun et.al. | 2503.20673 | null |
2025-03-26 | VPO: Aligning Text-to-Video Generation Models with Prompt Optimization | Jiale Cheng et.al. | 2503.20491 | link |
2025-03-26 | Adaptive Local Clustering over Attributed Graphs | Haoran Zheng et.al. | 2503.20488 | link |
2025-03-26 | Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability | Yingdong Shi et.al. | 2503.20483 | null |
2025-03-26 | 3D Convolutional Neural Networks for Improved Detection of Intracranial bleeding in CT Imaging | Bargava Subramanian et.al. | 2503.20306 | null |
2025-03-26 | Traversing Distortion-Perception Tradeoff using a Single Score-Based Generative Model | Yuhan Wang et.al. | 2503.20297 | null |
2025-03-26 | QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions | Siyin Wang et.al. | 2503.20290 | null |
2025-03-26 | EGVD: Event-Guided Video Diffusion Model for Physically Realistic Large-Motion Frame Interpolation | Ziran Zhang et.al. | 2503.20268 | link |
2025-03-25 | Scaling Down Text Encoders of Text-to-Image Diffusion Models | Lifu Wang et.al. | 2503.19897 | link |
2025-03-25 | LENVIZ: A High-Resolution Low-Exposure Night Vision Benchmark Dataset | Manjushree Aithal et.al. | 2503.19804 | null |
2025-03-25 | SITA: Structurally Imperceptible and Transferable Adversarial Attacks for Stylized Image Generation | Jingdan Kang et.al. | 2503.19791 | link |
2025-03-25 | EventMamba: Enhancing Spatio-Temporal Locality with State Space Models for Event-Based Video Reconstruction | Chengjie Ge et.al. | 2503.19721 | null |
2025-03-25 | Improved tissue sodium concentration quantification in breast cancer by reducing partial volume effects: a preliminary study | Olgica Zaric et.al. | 2503.19570 | null |
2025-03-25 | Single-Step Latent Consistency Model for Remote Sensing Image Super-Resolution | Xiaohui Sun et.al. | 2503.19505 | null |
2025-03-25 | AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset | Haiyu Zhang et.al. | 2503.19462 | null |
2025-03-26 | COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting | Jiaxin Zhang et.al. | 2503.19443 | link |
2025-03-25 | Exploring Semantic Feature Discrimination for Perceptual Image Super-Resolution and Opinion-Unaware No-Reference Image Quality Assessment | Guanglu Dong et.al. | 2503.19295 | link |
2025-03-25 | Learning Hazing to Dehazing: Towards Realistic Haze Generation for Real-World Image Dehazing | Ruiyi Wang et.al. | 2503.19262 | link |
2025-03-24 | Latent Space Class Dispersion: Effective Test Data Quality Assessment for DNNs | Vivek Vekariya et.al. | 2503.18799 | null |
2025-03-24 | Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition | Yifei Zhang et.al. | 2503.18746 | link |
2025-03-24 | Generative Dataset Distillation using Min-Max Diffusion Model | Junqiao Fan et.al. | 2503.18626 | null |
2025-03-25 | AMD-Hummingbird: Towards an Efficient Text-to-Video Model | Takashi Isobe et.al. | 2503.18559 | link |
2025-03-24 | EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation | Qiang Qu et.al. | 2503.18552 | null |
2025-03-24 | Uncertainty-guided Perturbation for Image Super-Resolution Diffusion Model | Leheng Zhang et.al. | 2503.18512 | null |
2025-03-24 | MuMA: 3D PBR Texturing via Multi-Channel Multi-View Generation and Agentic Post-Processing | Lingting Zhu et.al. | 2503.18461 | null |
2025-03-24 | Panorama Generation From NFoV Image Done Right | Dian Zheng et.al. | 2503.18420 | link |
2025-03-24 | Limited-angle SPECT image reconstruction using deep image prior | Kensuke Hori et.al. | 2503.18342 | null |
2025-03-23 | Collaborating with AI Agents: Field Experiments on Teamwork, Productivity, and Performance | Harang Ju et.al. | 2503.18238 | link |
2025-03-23 | TCFG: Tangential Damping Classifier-free Guidance | Mingi Kwon et.al. | 2503.18137 | null |
2025-03-23 | Real-World Remote Sensing Image Dehazing: Benchmark and Baseline | Zeng-Hui Zhu et.al. | 2503.17966 | link |
2025-03-23 | Cross-Domain Underwater Image Enhancement Guided by No-Reference Image Quality Assessment: A Transfer Learning Approach | Zhi Zhang et.al. | 2503.17937 | null |
2025-03-23 | Guided Diffusion for the Extension of Machine Vision to Human Visual Perception | Takahiro Shindo et.al. | 2503.17907 | null |
2025-03-22 | DVG-Diffusion: Dual-View Guided Diffusion Model for CT Reconstruction from X-Rays | Xing Xie et.al. | 2503.17804 | null |
2025-03-22 | Improving Preference Extraction In LLMs By Identifying Latent Knowledge Through Classifying Probes | Sharan Maiya et.al. | 2503.17755 | null |
2025-03-22 | MAMAT: 3D Mamba-Based Atmospheric Turbulence Removal and its Object Detection Capability | Paul Hill et.al. | 2503.17700 | null |
2025-03-22 | DCEvo: Discriminative Cross-Dimensional Evolutionary Learning for Infrared and Visible Image Fusion | Jinyuan Liu et.al. | 2503.17673 | link |
2025-03-21 | Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks | Bhishma Dedhia et.al. | 2503.17539 | null |
2025-03-21 | ProDehaze: Prompting Diffusion Models Toward Faithful Image Dehazing | Tianwen Zhou et.al. | 2503.17488 | link |
2025-03-21 | Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images | Jie Mei et.al. | 2503.17261 | link |
2025-03-21 | FFaceNeRF: Few-shot Face Editing in Neural Radiance Fields | Kwan Yun et.al. | 2503.17095 | link |
2025-03-21 | STFTCodec: High-Fidelity Audio Compression through Time-Frequency Domain Representation | Tao Feng et.al. | 2503.16989 | null |
2025-03-21 | Uncertainty-Driven Modeling of Microporosity and Permeability in Clastic Reservoirs Using Random Forest | Muhammad Risha et.al. | 2503.16957 | null |
2025-03-21 | MagicColor: Multi-Instance Sketch Colorization | Yinhan Zhang et.al. | 2503.16948 | null |
2025-03-21 | Design of 3D Non-Cartesian Trajectories for Fast Volumetric MRI via Analytic Coordinate Discretization | Kwang Eun Jang et.al. | 2503.16918 | null |
2025-03-21 | Depth-Aided Color Image Inpainting in Quaternion Domain | Shunki Tatsumi et.al. | 2503.16818 | null |
2025-03-21 | A-IDE : Agent-Integrated Denoising Experts | Uihyun Cho et.al. | 2503.16780 | null |
2025-03-21 | On Explaining (Large) Language Models For Code Using Global Code-Based Explanations | David N. Palacio et.al. | 2503.16771 | null |
2025-03-20 | SAGE: Semantic-Driven Adaptive Gaussian Splatting in Extended Reality | Chiara Schiavo et.al. | 2503.16747 | null |
2025-03-20 | EDiT: Efficient Diffusion Transformers with Linear Compressed Attention | Philipp Becker et.al. | 2503.16726 | null |
2025-03-20 | Euclid: Star clusters in IC 342, NGC 2403, and Holmberg II | S. S. Larsen et.al. | 2503.16637 | null |
2025-03-20 | Fed-NDIF: A Noise-Embedded Federated Diffusion Model For Low-Count Whole-Body PET Denoising | Yinchi Zhou et.al. | 2503.16635 | null |
2025-03-20 | A Recipe for Generating 3D Worlds From a Single Image | Katja Schwarz et.al. | 2503.16611 | null |
2025-03-20 | 1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering | Yuheng Yuan et.al. | 2503.16422 | null |
2025-03-20 | MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance | Quanhao Li et.al. | 2503.16421 | null |
2025-03-20 | InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity | Liming Jiang et.al. | 2503.16418 | link |
2025-03-20 | ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos | Haolin Yang et.al. | 2503.16400 | null |
2025-03-20 | Gaussian Graph Network: Learning Efficient and Generalizable Gaussian Representations from Multi-view Images | Shengjun Zhang et.al. | 2503.16338 | null |
2025-03-20 | Enhancing Software Quality Assurance with an Adaptive Differential Evolution based Quantum Variational Autoencoder-Transformer Model | Seshu Babu Barma et.al. | 2503.16335 | null |
2025-03-20 | Do image and video quality metrics model low-level human vision? | Dounia Hammou et.al. | 2503.16264 | null |
2025-03-20 | Iterative Optimal Attention and Local Model for Single Image Rain Streak Removal | Xiangyu Li et.al. | 2503.16165 | link |
2025-03-20 | Automatically Generating Chinese Homophone Words to Probe Machine Translation Estimation Systems | Shenbin Qian et.al. | 2503.16158 | link |
2025-03-20 | 3-D Image-to-Image Fusion in Lightsheet Microscopy by Two-Step Adversarial Network: Contribution to the FuseMyCells Challenge | Marek Wodzinski et.al. | 2503.16075 | null |
2025-03-20 | PoseTraj: Pose-Aware Trajectory Control in Video Diffusion | Longbin Ji et.al. | 2503.16068 | null |
2025-03-20 | Single Image Iterative Subject-driven Generation and Editing | Yair Shpitzer et.al. | 2503.16025 | link |
2025-03-20 | A Survey on fMRI-based Brain Decoding for Reconstructing Multimodal Stimuli | Pengyu Liu et.al. | 2503.15978 | null |
2025-03-20 | GraPLUS: Graph-based Placement Using Semantics for Image Composition | Mir Mohammad Khaleghi et.al. | 2503.15761 | null |
2025-03-19 | 5D free-running, reconstruction, variable projection, ADMM, VPAL | Yitong Yang et.al. | 2503.15711 | null |
2025-03-19 | Toward task-driven satellite image super-resolution | Maciej Ziaja et.al. | 2503.15474 | null |
2025-03-19 | Learn Your Scales: Towards Scale-Consistent Generative Novel View Synthesis | Fereshteh Forghani et.al. | 2503.15412 | null |
2025-03-19 | Boosting HDR Image Reconstruction via Semantic Knowledge Transfer | Qingsen Yan et.al. | 2503.15361 | null |
2025-03-19 | Euclid Quick Data Release (Q1): VIS processing and data products | Euclid Collaboration et.al. | 2503.15303 | null |
2025-03-19 | Automated Non-Functional Requirements Generation in Software Engineering with Large Language Models: A Comparative Study | Jomar Thomas Almonte et.al. | 2503.15248 | null |
2025-03-19 | 3D Engine-ready Photorealistic Avatars via Dynamic Textures | Yifan Wang et.al. | 2503.14943 | null |
2025-03-19 | FetalFlex: Anatomy-Guided Diffusion Model for Flexible Control on Fetal Ultrasound Image Synthesis | Yaofei Duan et.al. | 2503.14906 | null |
2025-03-19 | Temporal-Consistent Video Restoration with Pre-trained Diffusion Models | Hengkang Wang et.al. | 2503.14863 | null |
2025-03-19 | ClimateGS: Real-Time Climate Simulation with 3D Gaussian Style Transfer | Yuezhen Xie et.al. | 2503.14845 | null |
2025-03-18 | Involution and BSConv Multi-Depth Distillation Network for Lightweight Image Super-Resolution | Akram Khatami-Rizi et.al. | 2503.14779 | null |
2025-03-18 | A Simple Combination of Diffusion Models for Better Quality Trade-Offs in Image Denoising | Jonas Dornbusch et.al. | 2503.14654 | null |
2025-03-18 | The Power of Context: How Multimodality Improves Image Super-Resolution | Kangfu Mei et.al. | 2503.14503 | null |
2025-03-18 | ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing | Yulin Pan et.al. | 2503.14482 | null |
2025-03-18 | Optimized 3D Gaussian Splatting using Coarse-to-Fine Image Frequency Modulation | Umar Farooq et.al. | 2503.14475 | null |
2025-03-18 | RFMI: Estimating Mutual Information on Rectified Flow for Text-to-Image Alignment | Chao Wang et.al. | 2503.14358 | null |
2025-03-18 | Four checks for low-fidelity synthetic data: recommendations for disclosure control and quality evaluation | Gillian M Raab et.al. | 2503.14211 | null |
2025-03-18 | RBFIM: Perceptual Quality Assessment for Compressed Point Clouds Using Radial Basis Function Interpolation | Zhang Chen et.al. | 2503.14154 | null |
2025-03-18 | Towards properties of adversarial image perturbations | Egor Kuznetsov et.al. | 2503.14111 | null |
2025-03-18 | Image-Based Metrics in Ultrasound for Estimation of Global Speed-of-Sound | Roman Denkin et.al. | 2503.14094 | null |
2025-03-18 | Fast Autoregressive Video Generation with Diagonal Decoding | Yang Ye et.al. | 2503.14070 | null |
2025-03-18 | YOLO-LLTS: Real-Time Low-Light Traffic Sign Detection via Prior-Guided Enhancement and Multi-Branch Feature Interaction | Ziyu Lin et.al. | 2503.13883 | null |
2025-03-17 | Zero-Shot Denoising for Fluorescence Lifetime Imaging Microscopy with Intensity-Guided Learning | Hao Chen et.al. | 2503.13779 | link |
2025-03-17 | FiVE: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models | Minghan Li et.al. | 2503.13684 | null |
2025-03-17 | One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation | Daniil Selikhanovych et.al. | 2503.13358 | null |
2025-03-17 | MagicDistillation: Weak-to-Strong Video Distillation for Large-Scale Portrait Few-Step Synthesis | Shitong Shao et.al. | 2503.13319 | null |
2025-03-19 | FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis | Luxi Chen et.al. | 2503.13265 | null |
2025-03-17 | Don't Judge Before You CLIP: A Unified Approach for Perceptual Tasks | Amit Zalcher et.al. | 2503.13260 | null |
2025-03-17 | MedLoRD: A Medical Low-Resource Diffusion Model for High-Resolution 3D CT Image Synthesis | Marvin Seyfarth et.al. | 2503.13211 | null |
2025-03-18 | Rethinking Image Evaluation in Super-Resolution | Shaolin Su et.al. | 2503.13074 | null |
2025-03-17 | DehazeMamba: SAR-guided Optical Remote Sensing Image Dehazing with Adaptive State Space Model | Zhicheng Zhao et.al. | 2503.13073 | null |
2025-03-17 | DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models | Dewei Zhou et.al. | 2503.12885 | null |
2025-03-17 | CompMarkGS: Robust Watermarking for Compression 3D Gaussian Splatting | Sumin In et.al. | 2503.12836 | null |
2025-03-17 | R3-Avatar: Record and Retrieve Temporal Codebook for Reconstructing Photorealistic Human Avatars | Yifan Zhan et.al. | 2503.12751 | null |
2025-03-17 | GenStereo: Towards Open-World Generation of Stereo Images and Unsupervised Matching | Feng Qiao et.al. | 2503.12720 | link |
2025-03-16 | GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation | Tao Feng et.al. | 2503.12600 | null |
2025-03-16 | BalancedDPO: Adaptive Multi-Metric Alignment | Dipesh Tamboli et.al. | 2503.12575 | null |
2025-03-16 | Segment Any-Quality Images with Generative Latent Space Enhancement | Guangqian Guo et.al. | 2503.12507 | null |
2025-03-16 | SING: Semantic Image Communications using Null-Space and INN-Guided Diffusion Models | Jiakang Chen et.al. | 2503.12484 | null |
2025-03-16 | Pathology Image Restoration via Mixture of Prompts | Jiangdong Cai et.al. | 2503.12399 | link |
2025-03-15 | DLA-Count: Dynamic Label Assignment Network for Dense Cell Distribution Counting | Yuqing Yan et.al. | 2503.12063 | null |
2025-03-15 | MoDM: Efficient Serving for Image Generation via Mixture-of-Diffusion Models | Yuchen Xia et.al. | 2503.11972 | null |
2025-03-14 | TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation | Hongxiang Zhao et.al. | 2503.11423 | null |
2025-03-14 | TransiT: Transient Transformer for Non-line-of-sight Videography | Ruiqian Li et.al. | 2503.11328 | null |
2025-03-14 | Safe-VAR: Safe Visual Autoregressive Model for Text-to-Image Generative Watermarking | Ziyi Wang et.al. | 2503.11324 | null |
2025-03-14 | Leveraging Diffusion Knowledge for Generative Image Compression with Fractal Frequency-Aware Band Learning | Lingyu Zhu et.al. | 2503.11321 | null |
2025-03-14 | Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption | Du Chen et.al. | 2503.11221 | null |
2025-03-14 | Zero-TIG: Temporal Consistency-Aware Zero-Shot Illumination-Guided Low-light Video Enhancement | Yini Li et.al. | 2503.11175 | link |
2025-03-14 | Unifying Perplexing Behaviors in Modified BP Attributions through Alignment Perspective | Guanhua Zheng et.al. | 2503.11160 | null |
2025-03-14 | GaussianIP: Identity-Preserving Realistic 3D Human Generation via Human-Centric Diffusion Prior | Zichen Tang et.al. | 2503.11143 | link |
2025-03-14 | MobiVital: Self-supervised Time-series Quality Estimation for Contactless Respiration Monitoring Using UWB Radar | Ziqi Wang et.al. | 2503.11064 | link |
2025-03-14 | Comparative Analysis of Advanced AI-based Object Detection Models for Pavement Marking Quality Assessment during Daytime | Gian Antariksa et.al. | 2503.11008 | null |
2025-03-13 | Statistical Analysis of Sentence Structures through ASCII, Lexical Alignment and PCA | Abhijeet Sahdev et.al. | 2503.10470 | null |
2025-03-13 | RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models | Yijing Lin et.al. | 2503.10406 | null |
2025-03-13 | MACS: Multi-source Audio-to-image Generation with Contextual Significance and Semantic Alignment | Hao Zhou et.al. | 2503.10287 | null |
2025-03-13 | KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception | Yunpeng Qu et.al. | 2503.10259 | link |
2025-03-13 | Automatic quality control in multi-centric fetal brain MRI super-resolution reconstruction | Thomas Sanchez et.al. | 2503.10156 | link |
2025-03-13 | Image Quality Assessment: From Human to Machine Preference | Chunyi Li et.al. | 2503.10078 | link |
2025-03-12 | Bidirectional Learned Facial Animation Codec for Low Bitrate Talking Head Videos | Riku Takahashi et.al. | 2503.09787 | null |
2025-03-12 | Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models | Sangwon Jang et.al. | 2503.09669 | null |
2025-03-12 | CoRe^2: Collect, Reflect and Refine to Generate Better and Faster | Shitong Shao et.al. | 2503.09662 | link |
2025-03-12 | Fair Federated Medical Image Classification Against Quality Shift via Inter-Client Progressive State Matching | Nannan Wu et.al. | 2503.09587 | link |
2025-03-12 | FCaS: Fine-grained Cardiac Image Synthesis based on 3D Template Conditional Diffusion Model | Jiahao Xia et.al. | 2503.09560 | null |
2025-03-12 | Multi-Agent Image Restoration | Xu Jiang et.al. | 2503.09403 | null |
2025-03-12 | Bidirectional Prototype-Reward co-Evolution for Test-Time Adaptation of Vision-Language Models | Xiaozhen Qiao et.al. | 2503.09394 | null |
2025-03-12 | PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling | Nikolai Körber et.al. | 2503.09368 | link |
2025-03-12 | Fully-Synthetic Training for Visual Quality Inspection in Automotive Production | Christoph Huber et.al. | 2503.09354 | null |
2025-03-12 | Unified Dense Prediction of Video Diffusion | Lehan Yang et.al. | 2503.09344 | null |
2025-03-12 | Experimental study of the first telescope with a toroidal curved detector | Eduard Muslimov et.al. | 2503.09300 | null |
2025-03-12 | IQPFR: An Image Quality Prior for Blind Face Restoration and Beyond | Peng Hu et.al. | 2503.09294 | null |
2025-03-12 | Better Together: Unified Motion Capture and 3D Avatar Reconstruction | Arthur Moreau et.al. | 2503.09293 | null |
2025-03-12 | Active Learning Inspired ControlNet Guidance for Augmenting Semantic Segmentation Datasets | Hannah Kniesel et.al. | 2503.09221 | null |
2025-03-12 | Teaching LMMs for Image Quality Scoring and Interpreting | Zicheng Zhang et.al. | 2503.09197 | link |
2025-03-11 | Residual Learning and Filtering Networks for End-to-End Lossless Video Compression | Md baharul Islam et.al. | 2503.08819 | null |
2025-03-11 | Posterior-Mean Denoising Diffusion Model for Realistic PET Image Reconstruction | Yiran Sun et.al. | 2503.08546 | null |
2025-03-11 | Segmentation-Guided CT Synthesis with Pixel-Wise Conformal Uncertainty Bounds | David Vallmanya Poch et.al. | 2503.08515 | null |
2025-03-11 | NullFace: Training-Free Localized Face Anonymization | Han-Wei Kung et.al. | 2503.08478 | link |
2025-03-11 | DG16M: A Large-Scale Dataset for Dual-Arm Grasping with Force-Optimized Grasps | Md Faizal Karim et.al. | 2503.08358 | null |
2025-03-11 | Pathology-Aware Adaptive Watermarking for Text-Driven Medical Image Synthesis | Chanyoung Kim et.al. | 2503.08346 | null |
2025-03-11 | Diffusion Transformer Meets Random Masks: An Advanced PET Reconstruction Framework | Bin Huang et.al. | 2503.08339 | null |
2025-03-11 | Feature Alignment with Equivariant Convolutions for Burst Image Super-Resolution | Xinyi Liu et.al. | 2503.08300 | null |
2025-03-11 | PromptLNet: Region-Adaptive Aesthetic Enhancement via Prompt Guidance in Low-Light Enhancement Net | Jun Yin et.al. | 2503.08276 | null |
2025-03-11 | ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting | Junfu Guo et.al. | 2503.08135 | null |
2025-03-10 | Artificial Intelligence in Deliberation: The AI Penalty and the Emergence of a New Deliberative Divide | Andreas Jungherr et.al. | 2503.07690 | null |
2025-03-10 | GM-MoE: Low-Light Enhancement with Gated-Mechanism Mixture-of-Experts | Minwen Liao et.al. | 2503.07417 | null |
2025-03-10 | SPEED: Scalable, Precise, and Efficient Concept Erasure for Diffusion Models | Ouxiang Li et.al. | 2503.07392 | link |
2025-03-10 | Multimodal Human-AI Synergy for Medical Imaging Quality Control: A Hybrid Intelligence Framework with Adaptive Dataset Curation and Closed-Loop Evaluation | Zhi Qin et.al. | 2503.07032 | null |
2025-03-10 | Lightweight Multimodal Artificial Intelligence Framework for Maritime Multi-Scene Recognition | Xinyu Xi et.al. | 2503.06978 | null |
2025-03-09 | GenDR: Lightning Generative Detail Restorator | Yan Wang et.al. | 2503.06790 | null |
2025-03-09 | Unsupervised Multi-Clustering and Decision-Making Strategies for 4D-STEM Orientation Mapping | Junhao Cao et.al. | 2503.06699 | null |
2025-03-09 | PixelPonder: Dynamic Patch Adaptation for Enhanced Multi-Conditional Text-to-Image Generation | Yanjie Pan et.al. | 2503.06684 | null |
2025-03-09 | Learning Few-Step Diffusion Models by Trajectory Distribution Matching | Yihong Luo et.al. | 2503.06674 | link |
2025-03-09 | The New CMS Measure of Excessive Radiation Dose or Inadequate CT Image Quality: Methods for Size-Adjusted Dose and Their Variabilities | Gary Y Ge et.al. | 2503.06644 | null |
2025-03-09 | One-Step Diffusion Model for Image Motion-Deblurring | Xiaoyang Liu et.al. | 2503.06537 | link |
2025-03-08 | PTDiffusion: Free Lunch for Generating Optical Illusion Hidden Pictures with Phase-Transferred Diffusion Model | Xiang Gao et.al. | 2503.06186 | null |
2025-03-08 | BioMoDiffuse: Physics-Guided Biomechanical Diffusion for Controllable and Authentic Human Motion Synthesis | Zixi Kang et.al. | 2503.06151 | null |
2025-03-08 | Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model | Mingxing Li et.al. | 2503.06141 | null |
2025-03-08 | Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Flexible and Effective Paradigm | Jiebin Yan et.al. | 2503.06129 | link |
2025-03-08 | Feature Fusion Attention Network with CycleGAN for Image Dehazing, De-Snowing and De-Raining | Akshat Jain et.al. | 2503.06107 | null |
2025-03-07 | MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice | Hongwei Yi et.al. | 2503.05978 | null |
2025-03-07 | LapLoss: Laplacian Pyramid-based Multiscale loss for Image Translation | Krish Didwania et.al. | 2503.05974 | null |
2025-03-10 | VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control | Yuxuan Bian et.al. | 2503.05639 | link |
2025-03-07 | A-SEE2.0: Active-Sensing End-Effector for Robotic Ultrasound Systems with Dense Contact Surface Perception Enabled Probe Orientation Adjustment | Yernar Zhetpissov et.al. | 2503.05569 | null |
2025-03-07 | Development and Enhancement of Text-to-Image Diffusion Models | Rajdeep Roshan Sahu et.al. | 2503.05149 | null |
2025-03-07 | SMILENet: Unleashing Extra-Large Capacity Image Steganography via a Synergistic Mosaic InvertibLE Hiding Network | Jun-Jie Huang et.al. | 2503.05118 | null |
2025-03-06 | Toward Lightweight and Fast Decoders for Diffusion Models in Image and Video Generation | Alexey Buzovkin et.al. | 2503.04871 | link |
2025-03-08 | The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation | Aoxiong Yin et.al. | 2503.04606 | link |
2025-03-06 | In-Context Reverse Classification Accuracy: Efficient Estimation of Segmentation Quality without Ground-Truth | Matias Cosarinsky et.al. | 2503.04522 | null |
2025-03-06 | IMFine: 3D Inpainting via Geometry-guided Multi-view Refinement | Zhihao Shi et.al. | 2503.04501 | null |
2025-03-07 | LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding | Shen Zhang et.al. | 2503.04344 | null |
2025-03-05 | Positive-Unlabeled Diffusion Models for Preventing Sensitive Data Generation | Hiroshi Takahashi et.al. | 2503.03789 | null |
2025-03-05 | DO-IQS: Dynamics-Aware Offline Inverse Q-Learning for Optimal Stopping with Unknown Gain Functions | Anna Kuchko et.al. | 2503.03515 | null |
2025-03-05 | Automatic Drywall Analysis for Progress Tracking and Quality Control in Construction | Mariusz Trzeciakiewicz et.al. | 2503.03422 | null |
2025-03-05 | On the Relation Between Speech Quality and Quantized Latent Representations of Neural Codecs | Mhd Modar Halimeh et.al. | 2503.03304 | null |
2025-03-05 | Computational Analysis of Degradation Modeling in Blind Panoramic Image Quality Assessment | Jiebin Yan et.al. | 2503.03255 | null |
2025-03-05 | DSVD: Dynamic Self-Verify Decoding for Faithful Generation in Large Language Models | YiQiu Guo et.al. | 2503.03149 | null |
2025-03-04 | QE4PE: Word-level Quality Estimation for Human Post-Editing | Gabriele Sarti et.al. | 2503.03044 | link |
2025-03-04 | A Causal Framework for Aligning Image Quality Metrics and Deep Neural Network Robustness | Nathan Drenkow et.al. | 2503.02797 | null |
2025-03-04 | LADM: Long-context Training Data Selection with Attention-based Dependency Measurement for LLMs | Jianghao Chen et.al. | 2503.02502 | null |
2025-03-04 | Deep Robust Reversible Watermarking | Jiale Chen et.al. | 2503.02490 | null |
2025-03-04 | ERetinex: Event Camera Meets Retinex Theory for Low-Light Image Enhancement | Xuejian Guo et.al. | 2503.02484 | link |
2025-03-05 | Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content | Zicheng Zhang et.al. | 2503.02357 | link |
2025-03-04 | Exploring Simple Siamese Network for High-Resolution Video Quality Assessment | Guotao Shen et.al. | 2503.02330 | null |
2025-03-04 | Semantic Prior Distillation with Vision Foundation Model for Enhanced Rapid Bone Scintigraphy Image Restoration | Pengchen Liang et.al. | 2503.02321 | null |
2025-03-04 | Language-Guided Visual Perception Disentanglement for Image Quality Assessment and Conditional Image Generation | Zhichao Yang et.al. | 2503.02206 | null |
2025-03-04 | DarkDeblur: Learning single-shot image deblurring in low-light condition | S M A Sharif et.al. | 2503.02194 | link |
2025-03-03 | Integrating Misclassified EHR Outcomes with Validated Outcomes from a Non-probability Sample | Jenny Shen et.al. | 2503.02071 | null |
2025-03-03 | Quality Measures for Dynamic Graph Generative Models | Ryien Hosseini et.al. | 2503.01720 | link |
2025-03-03 | Evaluating LLMs' Assessment of Mixed-Context Hallucination Through the Lens of Summarization | Siya Qi et.al. | 2503.01670 | link |
2025-03-03 | MRI super-resolution reconstruction using efficient diffusion probabilistic model with residual shifting | Mojtaba Safari et.al. | 2503.01576 | link |
2025-03-03 | Evaluation and Facilitation of Online Discussions in the LLM Era: A Survey | Katerina Korre et.al. | 2503.01513 | null |
2025-03-03 | FlowDec: A flow-based full-band general audio codec with high perceptual quality | Simon Welker et.al. | 2503.01485 | link |
2025-03-03 | Improving the Efficiency of VVC using Partitioning of Reference Frames | Kamran Qureshi et.al. | 2503.01415 | null |
2025-03-03 | Wavelet-Enhanced Desnowing: A Novel Single Image Restoration Approach for Traffic Surveillance under Adverse Weather Conditions | Zihan Shen et.al. | 2503.01339 | null |
2025-03-03 | Reconciling Stochastic and Deterministic Strategies for Zero-shot Image Restoration using Diffusion Model in Dual | Chong Wang et.al. | 2503.01288 | link |
2025-03-03 | Every SAM Drop Counts: Embracing Semantic Priors for Multi-Modality Image Fusion and Beyond | Guanyao Wu et.al. | 2503.01210 | null |
2025-03-03 | DifIISR: A Diffusion Model with Gradient Guidance for Infrared Image Super-Resolution | Xingyuan Li et.al. | 2503.01187 | link |
2025-02-28 | Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos | Zhiyu Tan et.al. | 2502.21314 | null |
2025-02-28 | Bilevel Optimized Implicit Neural Representation for Scan-Specific Accelerated MRI Reconstruction | Hongze Yu et.al. | 2502.21292 | null |
2025-02-28 | Back to the Future Cyclopean Stereo: a human perception approach unifying deep and geometric constraints | Sherlon Almeida da Silva et.al. | 2502.21280 | null |
2025-02-28 | Does Generation Require Memorization? Creative Diffusion Models using Ambient Diffusion | Kulin Shah et.al. | 2502.21278 | null |
2025-02-28 | PET Image Denoising via Text-Guided Diffusion: Integrating Anatomical Priors through Text Prompts | Boxiao Yu et.al. | 2502.21260 | null |
2025-02-28 | Training-free and Adaptive Sparse Attention for Efficient Long Video Generation | Yifei Xia et.al. | 2502.21079 | null |
2025-02-28 | Decoder Gradient Shield: Provable and High-Fidelity Prevention of Gradient-Based Box-Free Watermark Removal | Haonan An et.al. | 2502.20924 | null |
2025-02-28 | Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision | Dawei Zhu et.al. | 2502.20790 | null |
2025-02-28 | WorldModelBench: Judging Video Generation Models As World Models | Dacheng Li et.al. | 2502.20694 | null |
2025-02-28 | Advancing AI-Powered Medical Image Synthesis: Insights from MedVQA-GI Challenge Using CLIP, Fine-Tuned Stable Diffusion, and Dream-Booth + LoRA | Ojonugwa Oluwafemi Ejiga Peter et.al. | 2502.20667 | null |
2025-02-27 | FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction | Siyu Jiao et.al. | 2502.20313 | link |
2025-02-27 | Mobius: Text to Seamless Looping Video Generation via Latent Shift | Xiuli Bi et.al. | 2502.20307 | link |
2025-02-27 | Low-rank tensor completion via a novel minimax | Hongbing Zhang et.al. | 2502.19979 | null |
2025-02-28 | Alleviating Distribution Shift in Synthetic Data for Machine Translation Quality Estimation | Xiang Geng et.al. | 2502.19941 | null |
2025-02-27 | Picking the Cream of the Crop: Visual-Centric Data Selection with Collaborative Agents | Zhenyu Liu et.al. | 2502.19917 | link |
2025-02-27 | High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model | Mingtao Guo et.al. | 2502.19894 | link |
2025-02-27 | Striving for Faster and Better: A One-Layer Architecture with Auto Re-parameterization for Low-Light Image Enhancement | Nan An et.al. | 2502.19867 | null |
2025-02-27 | LMHLD: A Large-scale Multi-source High-resolution Landslide Dataset for Landslide Detection based on Deep Learning | Guanting Liu et.al. | 2502.19866 | null |
2025-02-27 | Adaptive Score Alignment Learning for Continual Perceptual Quality Assessment of 360-Degree Videos in Virtual Reality | Kanglei Zhou et.al. | 2502.19644 | link |
2025-02-26 | 3D Nephrographic Image Synthesis in CT Urography with the Diffusion Model and Swin Transformer | Hongkun Yu et.al. | 2502.19623 | null |
2025-02-26 | Distill Not Only Data but Also Rewards: Can Smaller Language Models Surpass Larger Ones? | Yudi Zhang et.al. | 2502.19557 | null |
2025-02-26 | CLIP-Optimized Multimodal Image Enhancement via ISP-CNN Fusion for Coal Mine IoVT under Uneven Illumination | Shuai Wang et.al. | 2502.19450 | null |
2025-02-26 | Does 3D Gaussian Splatting Need Accurate Volumetric Rendering? | Adam Celarek et.al. | 2502.19318 | link |
2025-02-27 | RetinaRegen: A Hybrid Model for Readability and Detail Restoration in Fundus Images | Yuhan Tang et.al. | 2502.19153 | null |
2025-02-26 | Max360IQ: Blind Omnidirectional Image Quality Assessment with Multi-axis Attention | Jiebin Yan et.al. | 2502.19046 | link |
2025-02-26 | InternVQA: Advancing Compressed Video Quality Assessment with Distilling Large Foundation Model | Fengbin Guan et.al. | 2502.19026 | null |
2025-02-26 | Hyperspectral image reconstruction by deep learning with super-Rayleigh speckles | Ziyan Chen et.al. | 2502.18777 | null |
2025-02-25 | Is OpenAlex Suitable for Research Quality Evaluation and Which Citation Indicator is Best? | Mike Thelwall et.al. | 2502.18427 | null |
2025-02-25 | LAG: LLM agents for Leaderboard Auto Generation on Demanding | Jian Wu et.al. | 2502.18209 | null |
2025-02-25 | OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation | Yunpeng Gao et.al. | 2502.18041 | null |
2025-02-25 | Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments | Patomporn Payoungkhamdee et.al. | 2502.17956 | null |
2025-02-25 | Integrating Boosted learning with Differential Evolution (DE) Optimizer: A Prediction of Groundwater Quality Risk Assessment in Odisha | Sonalika Subudhi et.al. | 2502.17929 | null |
2025-02-24 | Optimized Memory System Architecture for VESA VDC-M Decoder with Multi-Slice Support | Hannah Yang et.al. | 2502.17729 | null |
2025-02-24 | Requirements for Quality Assurance of AI Models for Early Detection of Lung Cancer | Horst K. Hahn et.al. | 2502.17639 | null |
2025-02-25 | KV-Edit: Training-Free Image Editing for Precise Background Preservation | Tianrui Zhu et.al. | 2502.17363 | link |
2025-02-24 | Motion-Robust T2 Quantification from Gradient Echo MRI with Physics-Informed Deep Learning* | Hannah Eichhorn et.al. | 2502.17209 | null |
2025-02-24 | SFLD: Reducing the content bias for AI-generated Image Detection | Seoyeon Gye et.al. | 2502.17105 | null |
2025-02-24 | Pleno-Generation: A Scalable Generative Face Video Compression Framework with Bandwidth Intelligence | Bolin Chen et.al. | 2502.17085 | null |
2025-02-24 | PQDAST: Depth-Aware Arbitrary Style Transfer for Games via Perceptual Quality-Guided Distillation | Eleftherios Ioannou et.al. | 2502.16996 | null |
2025-02-24 | Multi-Dimensional Quality Assessment for Text-to-3D Assets: Dataset and Model | Kang Fu et.al. | 2502.16915 | link |
2025-02-24 | CRTrack: Low-Light Semi-Supervised Multi-object Tracking Based on Consistency Regularization | Zijing Zhao et.al. | 2502.16809 | null |
2025-02-23 | Automatic Input Rewriting Improves Translation with Large Language Models | Dayeon Ki et.al. | 2502.16682 | link |
2025-02-23 | AdverX-Ray: Ensuring X-Ray Integrity Through Frequency-Sensitive Adversarial VAEs | Francisco Caetano et.al. | 2502.16610 | link |
2025-02-22 | Multi-Party Data Pricing for Complex Data Trading Markets: A Rubinstein Bargaining Approach | Bing Mi et.al. | 2502.16363 | null |
2025-02-21 | Improved Partial Differential Equation and Fast Approximation Algorithm for Hazy/Underwater/Dust Storm Image Enhancement | Uche A. Nnolim et.al. | 2502.15986 | null |
2025-02-21 | Evaluate with the Inverse: Efficient Approximation of Latent Explanation Quality Distribution | Carlos Eiras-Franco et.al. | 2502.15403 | null |
2025-02-21 | Super-Resolution for Interferometric Imaging: Model Comparisons and Performance Analysis | Hasan Berkay Abdioglu et.al. | 2502.15397 | null |
2025-02-21 | Ultrasound Phase Aberrated Point Spread Function Estimation with Convolutional Neural Network: Simulation Study | Wei-Hsiang Shen et.al. | 2502.15298 | null |
2025-02-21 | Omnidirectional Image Quality Captioning: A Large-scale Database and A New Model | Jiebin Yan et.al. | 2502.15271 | link |
2025-02-21 | Lung-DDPM: Semantic Layout-guided Diffusion Models for Thoracic CT Image Synthesis | Yifan Jiang et.al. | 2502.15204 | link |
2025-02-21 | LUMINA-Net: Low-light Upgrade through Multi-stage Illumination and Noise Adaptation Network for Image Enhancement | Namrah Siddiqua et.al. | 2502.15186 | null |
2025-02-21 | M3-AGIQA: Multimodal, Multi-Round, Multi-Aspect AI-Generated Image Quality Assessment | Chuan Cui et.al. | 2502.15167 | link |
2025-02-21 | Optimized Pap Smear Image Enhancement: Hybrid PMD Filter-CLAHE Using Spider Monkey Optimization | Ach Khozaimi et.al. | 2502.15156 | null |
2025-02-20 | Hardware-Friendly Static Quantization Method for Video Diffusion Transformers | Sanghyun Yi et.al. | 2502.15077 | null |
2025-02-20 | Multi-Source Static CT with Adaptive Fluence Modulation to Minimize Hallucinations in Generative Reconstructions | Matthew Tivnan et.al. | 2502.15060 | null |
2025-02-20 | GS-Cache: A GS-Cache Inference Framework for Large-scale Gaussian Splatting Models | Miao Tao et.al. | 2502.14938 | null |
2025-02-20 | Compact Latent Representation for Image Compression (CLRIC) | Ayman A. Ameen et.al. | 2502.14937 | null |
2025-02-20 | Benchmarking Multimodal RAG through a Chart-based Document Question-Answering Generation Framework | Yuming Yang et.al. | 2502.14864 | link |
2025-02-20 | Towards a Perspectivist Turn in Argument Quality Assessment | Julia Romberg et.al. | 2502.14501 | link |
2025-02-20 | Early-Exit and Instant Confidence Translation Quality Estimation | Vilém Zouhar et.al. | 2502.14429 | link |
2025-02-20 | NeRF-3DTalker: Neural Radiance Field with 3D Prior Aided Audio Disentanglement for Talking Head Synthesis | Xiaoxing Liu et.al. | 2502.14178 | null |
2025-02-19 | A Baseline Method for Removing Invisible Image Watermarks using Deep Image Prior | Hengyue Liang et.al. | 2502.13998 | link |
2025-02-19 | Remote Sensing Semantic Segmentation Quality Assessment based on Vision Language Model | Huiying Shi et.al. | 2502.13990 | null |
2025-02-19 | A Lightweight Model for Perceptual Image Compression via Implicit Priors | Hao Wei et.al. | 2502.13988 | null |
2025-02-19 | An Overall Real-Time Mechanism for Classification and Quality Evaluation of Rice | Wanke Xia et.al. | 2502.13764 | null |
2025-02-19 | HawkBench: Investigating Resilience of RAG Methods on Stratified Information-Seeking Tasks | Hongjin Qian et.al. | 2502.13465 | null |
2025-02-19 | OGBoost: A Python Package for Ordinal Gradient Boosting | Mansour T. A. Sharabiani et.al. | 2502.13456 | null |
2025-02-18 | VUS: Effective and Efficient Accuracy Measures for Time-Series Anomaly Detection | Paul Boniol et.al. | 2502.13318 | link |
2025-02-18 | Optimal covering of rectangular grid graphs with tours of constrained length | Sergey Bereg et.al. | 2502.13306 | null |
2025-02-18 | Application of Context-dependent Interpretation of Biosignals Recognition to Control a Bionic Multifunctional Hand Prosthesis | Pawel Trajdos et.al. | 2502.13301 | null |
2025-02-18 | Enhancing Machine Learning Performance through Intelligent Data Quality Assessment: An Unsupervised Data-centric Framework | Manal Rahal et.al. | 2502.13198 | null |
2025-02-18 | GS-QA: Comprehensive Quality Assessment Benchmark for Gaussian Splatting View Synthesis | Pedro Martin et.al. | 2502.13196 | null |
2025-02-18 | Language Barriers: Evaluating Cross-Lingual Performance of CNN and Transformer Architectures for Speech Quality Estimation | Wafaa Wardah et.al. | 2502.13004 | null |
2025-02-18 | VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation | Xinlong Chen et.al. | 2502.12782 | link |
2025-02-18 | Efficient Machine Translation Corpus Generation: Integrating Human-in-the-Loop Post-Editing with Large Language Models | Kamer Ali Yuksel et.al. | 2502.12755 | link |
2025-02-18 | 3D Shape-to-Image Brownian Bridge Diffusion for Brain MRI Synthesis from Cortical Surfaces | Fabian Bongratz et.al. | 2502.12742 | null |
2025-02-18 | Translate Smart, not Hard: Cascaded Translation Systems with Quality-Aware Deferral | António Farinhas et.al. | 2502.12701 | null |
2025-02-19 | Spherical Dense Text-to-Image Synthesis | Timon Winter et.al. | 2502.12691 | null |
2025-02-18 | Design and Implementation of a Dual Uncrewed Surface Vessel Platform for Bathymetry Research under High-flow Conditions | Dinesh Kumar et.al. | 2502.12539 | null |
2025-02-18 | Comprehensive Assessment and Analysis for NSFW Content Erasure in Text-to-Image Diffusion Models | Die Chen et.al. | 2502.12527 | null |
2025-02-18 | Local Flaw Detection with Adaptive Pyramid Image Fusion Across Spatial Sampling Resolution for SWRs | Siyu You et.al. | 2502.12512 | null |
2025-02-17 | Token Communications: A Unified Framework for Cross-modal Context-aware Semantic Communications | Li Qiao et.al. | 2502.12096 | null |
2025-02-17 | Low-Rank Thinning | Annabelle Michael Carrell et.al. | 2502.12063 | link |
2025-02-17 | MultiFlow: A unified deep learning framework for multi-vessel classification, segmentation and clustering of phase-contrast MRI validated on a multi-site single ventricle patient cohort | Tina Yao et.al. | 2502.11993 | null |
2025-02-17 | Deep Spatio-Temporal Neural Network for Air Quality Reanalysis | Ammar Kheder et.al. | 2502.11941 | link |
2025-02-17 | No-reference geometry quality assessment for colorless point clouds via list-wise rank learning | Zheng Li et.al. | 2502.11726 | link |
2025-02-17 | The Worse The Better: Content-Aware Viewpoint Generation Network for Projection-related Point Cloud Quality Assessment | Zhiyong Su et.al. | 2502.11710 | link |
2025-02-17 | Assessing Correctness in LLM-Based Code Generation via Uncertainty Estimation | Arindam Sharma et.al. | 2502.11620 | null |
2025-02-17 | Syllables to Scenes: Literary-Guided Free-Viewpoint 3D Scene Synthesis from Japanese Haiku | Chunan Yu et.al. | 2502.11586 | null |
2025-02-18 | AI-Assisted Thin Section Image Processing for Pore-Throat Characterization in Tight Clastic Rocks | Muhammad Risha et.al. | 2502.11523 | null |
2025-02-17 | Semantically Robust Unsupervised Image Translation for Paired Remote Sensing Images | Sheng Fang et.al. | 2502.11468 | null |
2025-02-17 | HellaSwag-Pro: A Large-Scale Bilingual Benchmark for Evaluating the Robustness of LLMs in Commonsense Reasoning | Xiaoyuan Li et.al. | 2502.11393 | null |
2025-02-17 | A Physics-Informed Blur Learning Framework for Imaging Systems | Liqun Chen et.al. | 2502.11382 | link |
2025-02-17 | LLMs can Perform Multi-Dimensional Analytic Writing Assessments: A Case Study of L2 Graduate-Level Academic English Writing | Zhengxiang Wang et.al. | 2502.11368 | link |
2025-02-16 | Generating Skyline Datasets for Data Science Models | Mengying Wang et.al. | 2502.11262 | null |
2025-02-16 | Exploiting network optimization stability for enhanced PET image denoising using deep image prior | Fumio Hashimoto et.al. | 2502.11259 | null |
2025-02-16 | Are Generative Models Underconfident? An Embarrassingly Simple Quality Estimation Approach | Tu Anh Dinh et.al. | 2502.11115 | null |
2025-02-16 | Imaging current flow and injection in scalable graphene devices through NV-magnetometry | Kaj Dockx et.al. | 2502.11076 | null |
2025-02-15 | Automatic Quality Assessment of First Trimester Crown-Rump-Length Ultrasound Images | Sevim Cengiz et.al. | 2502.10908 | null |
2025-02-15 | AquaScope: Reliable Underwater Image Transmission on Mobile Devices | Beitong Tian et.al. | 2502.10891 | null |
2025-02-15 | E-3DGS: Event-Based Novel View Rendering of Large-Scale Scenes Using 3D Gaussian Splatting | Sohaib Zahid et.al. | 2502.10827 | null |
2025-02-14 | Large Language Models and Synthetic Data for Monitoring Dataset Mentions in Research Papers | Aivin V. Solatorio et.al. | 2502.10263 | link |
2025-02-14 | Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model | Guoqing Ma et.al. | 2502.10248 | link |
2025-02-14 | ProReco: A Process Discovery Recommender System | Tsung-Hao Huang et.al. | 2502.10230 | null |
2025-02-14 | RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control | Teng Li et.al. | 2502.10059 | null |
2025-02-14 | AffectSRNet: Facial Emotion-Aware Super-Resolution Network | Syed Sameen Ahmad Rizvi et.al. | 2502.09932 | null |
2025-02-14 | A Deep Learning Approach to Interface Color Quality Assessment in HCI | Shixiao Wang et.al. | 2502.09914 | null |
2025-02-14 | Compression-Aware One-Step Diffusion Model for JPEG Artifact Removal | Jinpei Guo et.al. | 2502.09873 | link |
2025-02-14 | Optimizing GPT for Video Understanding: Zero-Shot Performance and Prompt Engineering | Mark Beliaev et.al. | 2502.09573 | null |
2025-02-13 | Learned Correction Methods for Ultrasound Computed Tomography Imaging Using Simplified Physics Models | Luke Lozenski et.al. | 2502.09546 | null |
2025-02-13 | SQ-GAN: Semantic Image Communications Using Masked Vector Quantization | Francesco Pezone et.al. | 2502.09520 | link |
2025-02-13 | A Physics-Informed Deep Learning Model for MRI Brain Motion Correction | Mojtaba Safari et.al. | 2502.09296 | link |
2025-02-13 | ConsistentDreamer: View-Consistent Meshes Through Balanced Multi-View Gaussian Optimization | Onat Şahin et.al. | 2502.09278 | null |
2025-02-13 | PixLift: Accelerating Web Browsing via AI Upscaling | Yonas Atinafu et.al. | 2502.08995 | null |
2025-02-13 | Some problems of developing astrophysical equipment and combining it with optical telescopes | Edward Emelianov et.al. | 2502.08992 | null |
2025-02-13 | Dynamic watermarks in images generated by diffusion models | Yunzhuo Chen et.al. | 2502.08927 | null |
2025-02-12 | A procedure for assessing of machine health index data prediction quality | Daniel Kuzio et.al. | 2502.08837 | null |
2025-02-12 | Ultrasound imaging of cortical bone: cortex geometry and measurement of porosity based on wave speed for bone remodeling estimation | Amadou S. Dia et.al. | 2502.08824 | null |
2025-02-12 | Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation | Hoigi Seo et.al. | 2502.08690 | null |
2025-02-12 | Light-A-Video: Training-free Video Relighting via Progressive Light Fusion | Yujie Zhou et.al. | 2502.08590 | link |
2025-02-12 | Quality-Aware Decoding: Unifying Quality Estimation and Decoding | Sai Koneru et.al. | 2502.08561 | null |
2025-02-12 | A Survey on Image Quality Assessment: Insights, Analysis, and Future Outlook | Chengqian Ma et.al. | 2502.08540 | null |
2025-02-12 | TuMag: the tunable magnetograph for the Sunrise III mission | J. C. del Toro Iniesta et.al. | 2502.08268 | null |
2025-02-12 | Forward and Inverse Problems in Nonlinear Acoustics | Barbara Kaltenbacher et.al. | 2502.08194 | null |
2025-02-11 | Automatic Prostate Volume Estimation in Transabdominal Ultrasound Images | Tiziano Natali et.al. | 2502.07859 | null |
2025-02-11 | Magic 1-For-1: Generating One Minute Video Clips within One Minute | Hongwei Yi et.al. | 2502.07701 | link |
2025-02-11 | An Improved Optimal Proximal Gradient Algorithm for Non-Blind Image Deblurring | Qingsong Wang et.al. | 2502.07602 | null |
2025-02-13 | Enhance-A-Video: Better Generated Video for Free | Yang Luo et.al. | 2502.07508 | link |
2025-02-11 | Compound Mask for Divergent Wave Imaging in Medical Ultrasound | Zahraa Alzein et.al. | 2502.07453 | null |
2025-02-11 | On Iterative Evaluation and Enhancement of Code Quality Using GPT-4o | Rundong Liu et.al. | 2502.07399 | link |
2025-02-11 | USRNet: Unified Scene Recovery Network for Enhancing Traffic Imaging under Multiple Adverse Weather Conditions | Yuxu Lu et.al. | 2502.07372 | link |
2025-02-11 | Multi-Task-oriented Nighttime Haze Imaging Enhancer for Vision-driven Measurement Systems | Ai Chen et.al. | 2502.07351 | link |
2025-02-11 | Playmate: Flexible Control of Portrait Animation via 3D-Implicit Space Guided Diffusion | Xingpei Ma et.al. | 2502.07203 | null |
2025-02-11 | HDCompression: Hybrid-Diffusion Image Compression for Ultra-Low Bitrates | Lei Lu et.al. | 2502.07160 | null |
2025-02-10 | Evaluation of Multilingual Image Captioning: How far can we get with CLIP models? | Gonçalo Gomes et.al. | 2502.06600 | link |
2025-02-10 | Image Intrinsic Scale Assessment: Bridging the Gap Between Quality and Resolution | Vlad Hosu et.al. | 2502.06476 | null |
2025-02-10 | How Humans Help LLMs: Assessing and Incentivizing Human Preference Annotators | Shang Liu et.al. | 2502.06387 | null |
2025-02-10 | Guidance-base Diffusion Models for Improving Photoacoustic Image Quality | Tatsuhiro Eguchi et.al. | 2502.06354 | null |
2025-02-10 | LANTERN++: Enhanced Relaxed Speculative Decoding with Static Tree Drafting for Visual Auto-regressive Models | Sihwan Park et.al. | 2502.06352 | link |
2025-02-10 | A CT Geometry With Multiple Centers Of Rotation For Solving Sparse View Problem | Jiayu Duan et.al. | 2502.06125 | null |
2025-02-10 | Token-Domain Multiple Access: Exploiting Semantic Orthogonality for Collision Mitigation | Li Qiao et.al. | 2502.06118 | null |
2025-02-09 | Dual Caption Preference Optimization for Diffusion Models | Amir Saeidi et.al. | 2502.06023 | link |
2025-02-09 | A Comprehensive Survey on Image Signal Processing Approaches for Low-Illumination Image Enhancement | Muhammad Turab et.al. | 2502.05995 | null |
2025-02-09 | Multi-Branch Collaborative Learning Network for Video Quality Assessment in Industrial Video Search | Hengzhu Tang et.al. | 2502.05924 | null |
2025-02-09 | Devil is in the Details: Density Guidance for Detail-Aware Generation with Flow Models | Rafał Karczewski et.al. | 2502.05807 | null |
2025-02-08 | Semantic-Aware Adaptive Video Streaming Using Latent Diffusion Models for Wireless Networks | Zijiang Yan et.al. | 2502.05695 | null |
2025-02-08 | FreeBlend: Advancing Concept Blending with Staged Feedback-Driven Interpolation Diffusion | Yufan Zhou et.al. | 2502.05606 | null |
2025-02-07 | Distillation and Pruning for Scalable Self-Supervised Representation-Based Speech Quality Assessment | Benjamin Stahl et.al. | 2502.05356 | link |
2025-02-07 | AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting | Chung-Ho Wu et.al. | 2502.05176 | null |
2025-02-07 | Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound | Andros Tjandra et.al. | 2502.05139 | link |
2025-02-07 | Cached Multi-Lora Composition for Multi-Concept Image Generation | Xiandong Zou et.al. | 2502.04923 | link |
2025-02-07 | Integration Concept of the CBM Micro Vertex Detector | Franz Matejcek et.al. | 2502.04858 | null |
2025-02-06 | ADIFF: Explaining audio difference using natural language | Soham Deshmukh et.al. | 2502.04476 | link |
2025-02-05 | DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization | Zhenglin Zhou et.al. | 2502.04370 | null |
2025-02-06 | BOUQuET: dataset, Benchmark and Open initiative for Universal Quality Evaluation in Translation | The Omnilingual MT Team et.al. | 2502.04314 | null |
2025-02-06 | Content-Rich AIGC Video Quality Assessment via Intricate Text Alignment and Motion-Aware Consistency | Shangkun Sun et.al. | 2502.04076 | link |
2025-02-06 | DICE: Distilling Classifier-Free Guidance into Text Embeddings | Zhenyu Zhou et.al. | 2502.03726 | null |
2025-02-05 | Quasi-Monte Carlo Methods: What, Why, and How? | Fred J. Hickernell et.al. | 2502.03644 | null |
2025-02-05 | Efficient Image Restoration via Latent Consistency Flow Matching | Elad Cohen et.al. | 2502.03500 | null |
2025-02-05 | A new method for structural diagnostics with muon tomography and deep learning | Lorenzo Pezzotti et.al. | 2502.03339 | null |
2025-02-05 | A Framework for Measuring the Quality of Infrastructure-as-Code Scripts | Pandu Ranga Reddy Konala et.al. | 2502.03127 | null |
2025-02-05 | Poisson Flow Joint Model for Multiphase contrast-enhanced CT | Rongjun Ge et.al. | 2502.03079 | null |
2025-02-05 | A Decade of Action Quality Assessment: Largest Systematic Survey of Trends, Challenges, and Future Directions | Hao Yin et.al. | 2502.02817 | null |
2025-02-04 | Muographic Image Upsampling with Machine Learning for Built Infrastructure Applications | William O'Donnell et.al. | 2502.02624 | null |
2025-02-04 | A comparison of translation performance between DeepL and Supertext | Alex Flückiger et.al. | 2502.02577 | link |
2025-02-04 | Privacy Attacks on Image AutoRegressive Models | Antoni Kowalczuk et.al. | 2502.02514 | link |
2025-02-04 | VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models | Hila Chefer et.al. | 2502.02492 | null |
2025-02-04 | High-Fidelity Human Avatars from Laptop Webcams using Edge Compute | Akash Haridas et.al. | 2502.02468 | null |
2025-02-04 | Exploring the Feasibility of AI-Assisted Spine MRI Protocol Optimization Using DICOM Image Metadata | Alice Vian et.al. | 2502.02351 | null |
2025-02-04 | When Dimensionality Hurts: The Role of LLM Embedding Compression for Noisy Regression Tasks | Felix Drinkall et.al. | 2502.02199 | link |
2025-02-04 | PALQA: A Novel Parameterized Position-Aware Lossy Quantum Autoencoder using LSB Control Qubit for Efficient Image Compression | Ershadul Haque et.al. | 2502.02188 | null |
2025-02-05 | IPO: Iterative Preference Optimization for Text-to-Video Generation | Xiaomeng Yang et.al. | 2502.02088 | null |
2025-02-03 | Spectra of He isotopes and the $^3$He/$^4$He ratio | M. J. Boschini et.al. | 2502.01887 | null |
2025-02-03 | Sparse Measurement Medical CT Reconstruction using Multi-Fused Block Matching Denoising Priors | Maliha Hossain et.al. | 2502.01832 | null |
2025-02-03 | Generating Multi-Image Synthetic Data for Text-to-Image Customization | Nupur Kumari et.al. | 2502.01720 | null |
2025-02-03 | CLIP-DQA: Blindly Evaluating Dehazed Images from Global and Local Perspectives Using CLIP | Yirui Zeng et.al. | 2502.01707 | null |
2025-02-03 | Proposal and Evaluation of a Practical CBCT Dose Optimization Method | S. Gros et.al. | 2502.01509 | null |
2025-02-03 | Human Body Restoration with One-Step Diffusion Model and A New Benchmark | Jue Gong et.al. | 2502.01411 | null |
2025-02-03 | Explainability-Driven Quality Assessment for Rule-Based Systems | Oshani Seneviratne et.al. | 2502.01253 | null |
2025-02-03 | Imaging simulation of a dual-panel PET geometry with ultrafast TOF detectors | Taiyo Ishikawa et.al. | 2502.01006 | null |
2025-02-02 | Weak Supervision Dynamic KL-Weighted Diffusion Models Guided by Large Language Models | Julian Perry et.al. | 2502.00826 | null |
2025-02-02 | EmoTalkingGaussian: Continuous Emotion-conditioned Talking Head Synthesis | Junuk Cha et.al. | 2502.00654 | null |
2025-02-01 | Deep Task-Based Beamforming and Channel Data Augmentations for Enhanced Ultrasound Imaging | Ariel Amar et.al. | 2502.00524 | null |
2025-02-01 | A framework for river connectivity classification using temporal image processing and attention based neural networks | Timothy James Becker et.al. | 2502.00474 | null |
2025-01-31 | Trust and Trustworthiness from Human-Centered Perspective in HRI -- A Systematic Literature Review | Debora Firmino de Souza et.al. | 2501.19323 | null |
2025-01-31 | Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search | Yuta Oshima et.al. | 2501.19252 | null |
2025-01-31 | Ambient Denoising Diffusion Generative Adversarial Networks for Establishing Stochastic Object Models from Noisy Image Data | Xichen Xu et.al. | 2501.19094 | null |
2025-01-31 | OmniPhysGS: 3D Constitutive Gaussians for General Physics-Based Dynamics Generation | Yuchen Lin et.al. | 2501.18982 | null |
2025-01-31 | Distorting Embedding Space for Safety: A Defense Mechanism for Adversarially Robust Diffusion Models | Jaesin Ahn et.al. | 2501.18877 | link |
2025-01-29 | Fake News Detection After LLM Laundering: Measurement and Explanation | Rupak Kumar Das et.al. | 2501.18649 | link |
2025-01-31 | Task-based Regularization in Penalized Least-Squares for Binary Signal Detection Tasks in Medical Image Denoising | Wentao Chen et.al. | 2501.18418 | null |
2025-01-30 | Adaptive Video Streaming with AI-Based Optimization for Dynamic Network Conditions | Mohammad Tarik et.al. | 2501.18332 | null |
2025-01-30 | AGAV-Rater: Adapting Large Multimodal Model for AI-Generated Audio-Visual Quality Assessment | Yuqin Cao et.al. | 2501.18314 | null |
2025-02-03 | Efficient Feature Fusion for UAV Object Detection | Xudong Wang et.al. | 2501.17983 | link |
2025-01-29 | Discrete Dielectric Coatings for Length Control and Tunability of Half-Wave Dipole Antennas at 300 MHz Magnetic Resonance Imaging Applications | Aditya A Bhosale et.al. | 2501.17954 | null |
2025-01-29 | Leveraging In-Context Learning and Retrieval-Augmented Generation for Automatic Question Generation in Educational Domains | Subhankar Maity et.al. | 2501.17397 | null |
2025-01-29 | On the Coexistence and Ensembling of Watermarks | Aleksandar Petrov et.al. | 2501.17356 | link |
2025-01-28 | Giving the Old a Fresh Spin: Quality Estimation-Assisted Constrained Decoding for Automatic Post-Editing | Sourabh Deoghare et.al. | 2501.17265 | null |
2025-01-27 | Audio Large Language Models Can Be Descriptive Speech Quality Evaluators | Chen Chen et.al. | 2501.17202 | null |
2025-01-31 | IC-Portrait: In-Context Matching for View-Consistent Personalized Portrait | Han Yang et.al. | 2501.17159 | null |
2025-01-28 | Three-Dimensional Diffusion-Weighted Multi-Slab MRI With Slice Profile Compensation Using Deep Energy Model | Reza Ghorbani et.al. | 2501.17152 | null |
2025-01-28 | Evaluating CrowdSplat: Perceived Level of Detail for Gaussian Crowds | Xiaohan Sun et.al. | 2501.17085 | null |
2025-01-28 | EdgeMLOps: Operationalizing ML models with Cumulocity IoT and thin-edge.io for Visual quality Inspection | Kanishk Chaturvedi et.al. | 2501.17062 | null |
2025-01-28 | EZOA: Nançay HI follow-up observations in the Zone of Avoidance | A. C. Schröder et.al. | 2501.17038 | null |
2025-01-28 | Image-Space Gridding for Nonrigid Motion-Corrected MR Image Reconstruction | Kwang Eun Jang et.al. | 2501.16713 | null |
2025-01-25 | MambaTron: Efficient Cross-Modal Point Cloud Enhancement using Aggregate Selective State Space Modeling | Sai Tarun Inaganti et.al. | 2501.16384 | null |
2025-01-27 | Adaptive Iterative Compression for High-Resolution Files: an Approach Focused on Preserving Visual Quality in Cinematic Workflows | Leonardo Melo et.al. | 2501.16319 | null |
2025-01-27 | UDBE: Unsupervised Diffusion-based Brightness Enhancement in Underwater Images | Tatiana Taís Schein et.al. | 2501.16211 | link |
2025-01-27 | Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality Evaluation | Xing Zhang et.al. | 2501.16050 | null |
2025-01-30 | Can Location Embeddings Enhance Super-Resolution of Satellite Imagery? | Daniel Panangian et.al. | 2501.15847 | null |
2025-01-26 | Advancing quantum imaging through learning theory | Yunkai Wang et.al. | 2501.15685 | null |
2025-01-26 | Radiologist-in-the-Loop Self-Training for Generalizable CT Metal Artifact Reduction | Chenglong Ma et.al. | 2501.15610 | link |
2025-01-26 | Differentiable Low-computation Global Correlation Loss for Monotonicity Evaluation in Quality Assessment | Yipeng Liu et.al. | 2501.15485 | null |
2025-01-25 | Image formation theory of optical coherence tomography with optical aberrations and its application for computational aberration correction | Shuichi Makita et.al. | 2501.15011 | null |
2025-01-24 | SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation | Yujian Liu et.al. | 2501.14646 | null |
2025-01-24 | WanJuanSiLu: A High-Quality Open-Source Webtext Dataset for Low-Resource Languages | Jia Yu et.al. | 2501.14506 | link |
2025-01-24 | Enhancing Intelligibility for Generative Target Speech Extraction via Joint Optimization with Target Speaker ASR | Hao Ma et.al. | 2501.14477 | null |
2025-01-24 | Deep Learning-Powered Classification of Thoracic Diseases in Chest X-Rays | Yiming Lei et.al. | 2501.14279 | null |
2025-01-24 | CDI: Blind Image Restoration Fidelity Evaluation based on Consistency with Degraded Image | Xiaojun Tang et.al. | 2501.14264 | null |
2025-01-24 | GreedyPixel: Fine-Grained Black-Box Adversarial Attack Via Greedy Algorithm | Hanrui Wang et.al. | 2501.14230 | null |
2025-01-24 | Sparse Mixture-of-Experts for Non-Uniform Noise Reduction in MRI Images | Zeyun Deng et.al. | 2501.14198 | null |
2025-01-24 | VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking | Runyi Hu et.al. | 2501.14195 | link |
2025-01-23 | AdEval: Alignment-based Dynamic Evaluation to Mitigate Data Contamination in Large Language Models | Yang Fan et.al. | 2501.13983 | null |
2025-01-23 | Improving Video Generation with Human Feedback | Jie Liu et.al. | 2501.13918 | null |
2025-01-23 | VARFVV: View-Adaptive Real-Time Interactive Free-View Video Streaming with Edge Computing | Qiang Hu et.al. | 2501.13630 | link |
2025-01-23 | Diffusion-based Perceptual Neural Video Compression with Temporal Diffusion Information Reuse | Wenzhuo Ma et.al. | 2501.13528 | null |
2025-01-23 | LDR-Net: A Novel Framework for AI-generated Image Detection via Localized Discrepancy Representation | JiaXin Chen et.al. | 2501.13475 | null |
2025-01-23 | From Images to Point Clouds: An Efficient Solution for Cross-media Blind Quality Assessment without Annotated Training | Yipeng Liu et.al. | 2501.13387 | null |
2025-01-23 | Enhanced Extractor-Selector Framework and Symmetrization Weighted Binary Cross-Entropy for Edge Detections | Hao Shu et.al. | 2501.13365 | null |
2025-01-22 | UniRestore: Unified Perceptual and Task-Oriented Image Restoration Model Using Diffusion Prior | I-Hsiang Chen et.al. | 2501.13134 | null |
2025-01-23 | Accelerate High-Quality Diffusion Models with Inner Loop Feedback | Matthew Gwilliam et.al. | 2501.13107 | null |
2025-01-22 | Real-time Terahertz Compressive Optical-Digital Neural Network Imaging | Shao-Hsuan Wu et.al. | 2501.13065 | null |
2025-01-22 | Sketch and Patch: Efficient 3D Gaussian Representation for Man-Made Scenes | Yuang Shi et.al. | 2501.13045 | null |
2025-01-22 | Characterizing Collective Efforts in Content Sharing and Quality Control for ADHD-relevant Content on Video-sharing Platforms | Hanxiu 'Hazel' Zhu et.al. | 2501.13020 | null |
2025-01-22 | Paper Quality Assessment based on Individual Wisdom Metrics from Open Peer Review | Andrii Zahorodnii et.al. | 2501.13014 | null |
2025-01-22 | SoundSpring: Loss-Resilient Audio Transceiver with Dual-Functional Masked Language Modeling | Shengshi Yao et.al. | 2501.12696 | null |
2025-01-22 | Approximate Puzzlepiece Compositing | Xuan Huang et.al. | 2501.12581 | null |
2025-01-21 | Interaction Dataset of Autonomous Vehicles with Traffic Lights and Signs | Zheng Li et.al. | 2501.12536 | null |
2025-01-21 | Bidirectional Brain Image Translation using Transfer Learning from Generic Pre-trained Models | Fatima Haimour et.al. | 2501.12488 | null |
2025-01-21 | DiffDoctor: Diagnosing Image Diffusion Models Before Treating | Yiyang Wang et.al. | 2501.12382 | null |
2025-01-21 | Regressor-Guided Image Editing Regulates Emotional Response to Reduce Online Engagement | Christoph Gebhardt et.al. | 2501.12289 | null |
2025-01-21 | A Dynamic Programming Framework for Generating Approximately Diverse and Optimal Solutions | Waldo Gálvez et.al. | 2501.12261 | null |
2025-01-21 | Joint Reconstruction and Motion Estimation in Sparse-View 4DCT Using Diffusion Models within a Blind Inverse Problem Framework | Antoine De Paepe et.al. | 2501.12249 | null |
2025-01-21 | DLEN: Dual Branch of Transformer for Low-Light Image Enhancement in Dual Domains | Junyu Xia et.al. | 2501.12235 | null |
2025-01-21 | RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression | Uri Gadot et.al. | 2501.12216 | null |
2025-01-21 | Fast-RF-Shimming: Accelerate RF Shimming in 7T MRI using Deep Learning | Zhengyi Lu et.al. | 2501.12157 | null |
2025-01-21 | A Multi-annotated and Multi-modal Dataset for Wide-angle Video Quality Assessment | Bo Hu et.al. | 2501.12082 | null |
2025-01-22 | GSVC: Efficient Video Representation and Compression Through 2D Gaussian Splatting | Longan Wang et.al. | 2501.12060 | null |
2025-01-21 | Power Amplifier-Aware Transmit Power Optimization for OFDM and SC-FDMA Systems | Pawel Kryszkiewicz et.al. | 2501.11994 | null |
2025-01-21 | Bayesian Despeckling of Structured Sources | Ali Zafari et.al. | 2501.11860 | null |
2025-01-20 | EfficientVITON: An Efficient Virtual Try-On Model using Optimized Diffusion Process | Mostafa Atef et.al. | 2501.11776 | null |
2025-01-20 | Teaching Large Language Models to Regress Accurate Image Quality Scores using Score Distribution | Zhiyuan You et.al. | 2501.11561 | null |
2025-01-20 | Fundus Image Quality Assessment and Enhancement: a Systematic Review | Heng Li et.al. | 2501.11520 | null |
2025-01-20 | Multitask Auxiliary Network for Perceptual Quality Assessment of Non-Uniformly Distorted Omnidirectional Images | Jiebin Yan et.al. | 2501.11512 | link |
2025-01-20 | Subjective and Objective Quality Assessment of Non-Uniformly Distorted Omnidirectional Images | Jiebin Yan et.al. | 2501.11511 | link |
2025-01-20 | See In Detail: Enhancing Sparse-view 3D Gaussian Splatting with Local Depth and Semantic Regularization | Zongqi He et.al. | 2501.11508 | null |
2025-01-20 | Advancing Oyster Phenotype Segmentation with Multi-Network Ensemble and Multi-Scale mechanism | Wenli Yang et.al. | 2501.11203 | null |
2025-01-19 | Unit Region Encoding: A Unified and Compact Geometry-aware Representation for Floorplan Applications | Huichao Zhang et.al. | 2501.11097 | null |
2025-01-18 | EMO2: End-Effector Guided Audio-Driven Avatar Video Generation | Linrui Tian et.al. | 2501.10687 | null |
2025-01-17 | Fundamental mode power estimation through a | Filipp Lausch et.al. | 2501.10345 | null |
2025-01-17 | DiffStereo: High-Frequency Aware Diffusion Model for Stereo Image Restoration | Huiyun Cao et.al. | 2501.10325 | null |
2025-01-17 | CSHNet: A Novel Information Asymmetric Image Translation Method | Xi Yang et.al. | 2501.10197 | link |
2025-01-17 | DiffVSR: Enhancing Real-World Video Super-Resolution with Diffusion Models for Advanced Visual Quality and Temporal Consistency | Xiaohui Li et.al. | 2501.10110 | null |
2025-01-17 | CLIP-PCQA: Exploring Subjective-Aligned Vision-Language Modeling for Point Cloud Quality Assessment | Yating Liu et.al. | 2501.10071 | link |
2025-01-17 | One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression | Keita Miwa et.al. | 2501.10064 | null |
2025-01-17 | CaFA: Cost-aware, Feasible Attacks With Database Constraints Against Neural Tabular Classifiers | Matan Ben-Tov et.al. | 2501.10013 | link |
2025-01-17 | IE-Bench: Advancing the Measurement of Text-Driven Image Editing for Human Perception Alignment | Shangkun Sun et.al. | 2501.09927 | null |
2025-01-17 | Decoding Patterns of Data Generation Teams for Clinical and Scientific Success: Insights from the Bridge2AI Talent Knowledge Graph | Jiawei Xu et.al. | 2501.09897 | null |
2025-01-16 | EraseBench: Understanding The Ripple Effects of Concept Erasure Techniques | Ibtihel Amara et.al. | 2501.09833 | null |
2025-01-16 | Scan-Adaptive MRI Undersampling Using Neighbor-based Optimization (SUNO) | Siddhant Gautam et.al. | 2501.09799 | link |
2025-01-16 | Evaluating Conversational Recommender Systems with Large Language Models: A User-Centric Evaluation Framework | Nuo Chen et.al. | 2501.09493 | null |
2025-01-16 | Joint Transmission and Deblurring: A Semantic Communication Approach Using Events | Pujing Yang et.al. | 2501.09396 | null |
2025-01-16 | PATCHEDSERVE: A Patch Management Framework for SLO-Optimized Hybrid Resolution Diffusion Serving | Desen Sun et.al. | 2501.09253 | null |
2025-01-16 | Estimating Task-based Performance Bounds for Accelerated MRI Image Reconstruction Methods by Use of Learned-Ideal Observers | Kaiyan Li et.al. | 2501.09224 | null |
2025-01-15 | UNIR-Net: A Novel Approach for Restoring Underwater Images with Non-Uniform Illumination Using Synthetic Data | Ezequiel Perez-Zarate et.al. | 2501.09053 | link |
2025-01-15 | Lights, Camera, Matching: The Role of Image Illumination in Fair Face Recognition | Gabriella Pangelinan et.al. | 2501.08910 | null |
2025-01-15 | XMusic: Towards a Generalized and Controllable Symbolic Music Generation Framework | Sida Tian et.al. | 2501.08809 | null |
2025-01-16 | Holoview: Interactive 3D visualization of medical data in AR | Pankaj Kaushik et.al. | 2501.08736 | null |
2025-01-15 | DynamicFace: High-Quality and Consistent Video Face Swapping using Composable 3D Facial Priors | Runqi Wang et.al. | 2501.08553 | null |
2025-01-15 | Comprehensive Subjective and Objective Evaluation Method for Text-generated Video | Zelu Qi et.al. | 2501.08545 | null |
2025-01-14 | Head Motion Degrades Machine Learning Classification of Alzheimer's Disease from Positron Emission Tomography | Eléonore V. Lieffrig et.al. | 2501.08459 | null |
2025-01-14 | Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models | Weichen Fan et.al. | 2501.08453 | null |
2025-01-14 | Cross-Modal Transferable Image-to-Video Attack on Video Quality Metrics | Georgii Gotin et.al. | 2501.08415 | link |
2025-01-14 | Rolling phase modulation regime for dynamic full field OCT | Tual Monfort et.al. | 2501.08359 | null |
2025-01-15 | Optical information encryption using general temporal ghost imaging with practical experimental condition | Juan Wu et.al. | 2501.08136 | null |
2025-01-13 | Evaluating Human Perception of Novel View Synthesis: Subjective Quality Assessment of Gaussian Splatting and NeRF in Dynamic Scenes | Yuhang Zhang et.al. | 2501.08072 | null |
2025-01-14 | VENOM: Text-driven Unrestricted Adversarial Example Generation with Diffusion Models | Hui Kuurila-Zhang et.al. | 2501.07922 | link |
2025-01-14 | Demographic Variability in Face Image Quality Measures | Wassim Kabbani et.al. | 2501.07898 | null |
2025-01-14 | State-of-the-Art Transformer Models for Image Super-Resolution: Techniques, Challenges, and Applications | Debasish Dutta et.al. | 2501.07855 | null |
2025-01-13 | FaceOracle: Chat with a Face Image Oracle | Wassim Kabbani et.al. | 2501.07202 | null |
2025-01-13 | Radial Distortion in Face Images: Detection and Impact | Wassim Kabbani et.al. | 2501.07179 | null |
2025-01-13 | Eye Sclera for Fair Face Image Quality Assessment | Wassim Kabbani et.al. | 2501.07158 | null |
2025-01-13 | Privacy-Preserving Data Quality Assessment for Time-Series IoT Sensors | Novoneel Chakraborty et.al. | 2501.07154 | null |
2025-01-13 | Video Quality Assessment for Online Processing: From Spatial to Temporal Sampling | Jiebin Yan et.al. | 2501.07087 | null |
2025-01-12 | Real-Time Neural-Enhancement for Online Cloud Gaming | Shan Jiang et.al. | 2501.06880 | null |
2025-01-14 | Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution | Du Chen et.al. | 2501.06838 | link |
2025-01-11 | NVS-SQA: Exploring Self-Supervised Quality Representation Learning for Neurally Synthesized Scenes without References | Qiang Qu et.al. | 2501.06488 | link |
2025-01-10 | VideoAuteur: Towards Long Narrative Video Generation | Junfei Xiao et.al. | 2501.06173 | null |
2025-01-10 | CamCtrl3D: Single-Image Scene Exploration with Precise 3D Camera Control | Stefan Popov et.al. | 2501.06006 | null |
2025-01-10 | Universal-2-TF: Robust All-Neural Text Formatting for ASR | Yash Khare et.al. | 2501.05948 | null |
2025-01-10 | UltraRay: Full-Path Ray Tracing for Enhancing Realism in Ultrasound Simulation | Felix Duelmer et.al. | 2501.05828 | null |
2025-01-13 | AI-Driven Diabetic Retinopathy Screening: Multicentric Validation of AIDRSS in India | Amit Kr Dey et.al. | 2501.05826 | null |
2025-01-10 | Conditional Diffusion Model for Electrical Impedance Tomography | Duanpeng Shi et.al. | 2501.05769 | null |
2025-01-10 | LLVD: LSTM-based Explicit Motion Modeling in Latent Space for Blind Video Denoising | Loay Rashid et.al. | 2501.05744 | null |
2025-01-10 | FIRM: Federated Image Reconstruction using Multimodal Tomographic Data | Geunyeong Byeon et.al. | 2501.05642 | null |
2025-01-09 | Interpretable deep learning illuminates multiple structures fluorescence imaging: a path toward trustworthy artificial intelligence in microscopy | Mingyang Chen et.al. | 2501.05490 | null |
2025-01-09 | Consistent Flow Distillation for Text-to-3D Generation | Runjie Yan et.al. | 2501.05445 | null |
2025-01-09 | Scaffold-SLAM: Structured 3D Gaussians for Simultaneous Localization and Photorealistic Mapping | Wen Tianci et.al. | 2501.05242 | null |
2025-01-09 | 3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering | Dewei Zhou et.al. | 2501.05131 | null |
2025-01-09 | TipSegNet: Fingertip Segmentation in Contactless Fingerprint Imaging | Laurenz Ruzicka et.al. | 2501.05076 | null |
2025-01-09 | Towards Fingerprint Mosaicking Artifact Detection: A Self-Supervised Deep Learning Approach | Laurenz Ruzicka et.al. | 2501.05034 | null |
2025-01-08 | Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling | Nannan Li et.al. | 2501.04666 | null |
2025-01-08 | Enhancing Low-Cost Video Editing with Lightweight Adaptors and Temporal-Aware Inversion | Yangfan He et.al. | 2501.04606 | link |
2025-01-08 | When LLMs Struggle: Reference-less Translation Evaluation for Low-resource Languages | Archchana Sindhujan et.al. | 2501.04473 | null |
2025-01-08 | Enhancing kidney quality assessment: Power Doppler during normothermic machine perfusion | Yitian Fang et.al. | 2501.04457 | null |
2025-01-08 | iFADIT: Invertible Face Anonymization via Disentangled Identity Transform | Lin Yuan et.al. | 2501.04390 | null |
2025-01-08 | DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models | Hyogon Ryu et.al. | 2501.04304 | link |
2025-01-07 | Spatiotemporal Gaussian Optimization for 4D Cone Beam CT Reconstruction from Sparse Projections | Yabo Fu et.al. | 2501.04140 | link |
2025-01-07 | Motion-Aware Generative Frame Interpolation | Guozhen Zhang et.al. | 2501.03699 | null |
2025-01-07 | Action Quality Assessment via Hierarchical Pose-guided Multi-stage Contrastive Regression | Mengshi Qi et.al. | 2501.03674 | link |
2025-01-07 | Deep Learning-based Compression Detection for explainable Face Image Quality Assessment | Laurin Jonientz et.al. | 2501.03619 | link |
2025-01-07 | A generative approach for lensless imaging in low-light conditions | Ziyang Liu et.al. | 2501.03511 | null |
2025-01-07 | Can Deep Learning Trigger Alerts from Mobile-Captured Images? | Pritisha Sarkar et.al. | 2501.03499 | null |
2025-01-06 | A Trust-Guided Approach to MR Image Reconstruction with Side Information | Arda Atalık et.al. | 2501.03021 | link |
2025-01-06 | Quality Estimation based Feedback Training for Improving Pronoun Translation | Harshit Dhankhar et.al. | 2501.03008 | null |
2025-01-06 | GLFC: Unified Global-Local Feature and Contrast Learning with Mamba-Enhanced UNet for Synthetic CT Generation from CBCT | Xianhao Zhou et.al. | 2501.02992 | link |
2025-01-06 | Region of Interest based Medical Image Compression | Utkarsh Prakash Srivastava et.al. | 2501.02895 | null |
2025-01-06 | COph100: A comprehensive fundus image registration dataset from infants constituting the "RIDIRP" database | Yan Hu et.al. | 2501.02800 | null |
2025-01-06 | Ultrasound-QBench: Can LLMs Aid in Quality Assessment of Ultrasound Imaging? | Hongyi Miao et.al. | 2501.02751 | null |
2025-01-06 | Brick-Diffusion: Generating Long Videos with Brick-to-Wall Denoising | Yunlong Yuan et.al. | 2501.02741 | null |
2025-01-06 | Artificial Intelligence in Creative Industries: Advances Prior to 2025 | Nantheera Anantrasirichai et.al. | 2501.02725 | null |
2025-01-06 | Multilevel Semantic-Aware Model for AI-Generated Video Quality Assessment | Jiaze Li et.al. | 2501.02706 | null |
2025-01-05 | DepthMaster: Taming Diffusion Models for Monocular Depth Estimation | Ziyang Song et.al. | 2501.02576 | link |
2025-01-05 | Multi-LLM Collaborative Caption Generation in Scientific Documents | Jaeyoung Kim et.al. | 2501.02552 | link |
2025-01-05 | Pixel-Wise Feature Selection for Perceptual Edge Detection without post-processing | Hao Shu et.al. | 2501.02534 | null |
2025-01-07 | ACE++: Instruction-Based Image Creation and Editing via Context-Aware Content Filling | Chaojie Mao et.al. | 2501.02487 | null |
2025-01-05 | Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging Module | Zhongjian Cui et.al. | 2501.02452 | null |
2025-01-05 | Journey into Automation: Image-Derived Pavement Texture Extraction and Evaluation | Bingjie Lu et.al. | 2501.02414 | null |
2025-01-04 | Optimizing Audio Compression Through Entropy-Controlled Dithering | Ellison Murray et.al. | 2501.02293 | null |
2025-01-04 | TDM: Temporally-Consistent Diffusion Model for All-in-One Real-World Video Restoration | Yizhou Li et.al. | 2501.02269 | null |
2025-01-04 | Exploring Secure Machine Learning Through Payload Injection and FGSM Attacks on ResNet-50 | Umesh Yadav et.al. | 2501.02147 | null |
2025-01-03 | JoyGen: Audio-Driven 3D Depth-Aware Talking-Face Video Editing | Qili Wang et.al. | 2501.01798 | link |
2025-01-03 | Multi-modal classification of forest biodiversity potential from 2D orthophotos and 3D airborne laser scanning point clouds | Simon B. Jensen et.al. | 2501.01728 | null |
2025-01-03 | Aesthetic Matters in Music Perception for Image Stylization: A Emotion-driven Music-to-Visual Manipulation | Junjie Xu et.al. | 2501.01700 | null |
2025-01-02 | A Metasemantic-Metapragmatic Framework for Taxonomizing Multimodal Communicative Alignment | Eugene Yu Ji et.al. | 2501.01535 | null |
2025-01-02 | Embedding Similarity Guided License Plate Super Resolution | Abderrezzaq Sendjasni et.al. | 2501.01483 | null |
2024-12-31 | Estimation of 3T MR images from 1.5T images regularized with Physics based Constraint | Prabhjot Kaur et.al. | 2501.01464 | null |
2024-12-31 | GDSR: Global-Detail Integration through Dual-Branch Network with Wavelet Losses for Remote Sensing Image Super-Resolution | Qiwei Zhu et.al. | 2501.01460 | null |
2025-01-02 | ScarNet: A Novel Foundation Model for Automated Myocardial Scar Quantification from LGE in Cardiac MRI | Neda Tavakoli et.al. | 2501.01372 | link |
2025-01-02 | TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions | Vriksha Srihari et.al. | 2501.01156 | null |
2025-01-02 | HarmonyIQA: Pioneering Benchmark and Model for Image Harmonization Quality Assessment | Zitong Xu et.al. | 2501.01116 | null |
2025-01-02 | Generalized Task-Driven Medical Image Quality Enhancement with Gradient Promotion | Dong Zhang et.al. | 2501.01114 | null |
2025-01-02 | EliGen: Entity-Level Controlled Image Generation with Regional Attention | Hong Zhang et.al. | 2501.01097 | link |
2025-01-02 | Enhancing Precision of Automated Teller Machines Network Quality Assessment: Machine Learning and Multi Classifier Fusion Approaches | Alireza Safarzadeh et.al. | 2501.01067 | null |
2025-01-01 | Deconstructing the emission order of protons, neutrons and | Rohit Kumar et.al. | 2501.00963 | null |
2025-01-01 | Enhancing Early Diabetic Retinopathy Detection through Synthetic DR1 Image Generation: A StyleGAN3 Approach | Sagarnil Das et.al. | 2501.00954 | null |
2025-01-01 | SPADE: Enhancing Adaptive Cyber Deception Strategies with Generative AI and Structured Prompt Engineering | Shihab Ahmed et.al. | 2501.00940 | null |
2025-01-01 | Hierarchical Vision-Language Alignment for Text-to-Image Generation via Diffusion Models | Emily Johnson et.al. | 2501.00917 | null |
2025-01-01 | Text2Earth: Unlocking Text-driven Remote Sensing Image Generation with a Global-Scale Dataset and a Foundation Model | Chenyang Liu et.al. | 2501.00895 | null |
2025-01-01 | RORem: Training a Robust Object Remover with Human-in-the-Loop | Ruibin Li et.al. | 2501.00740 | link |
2024-12-31 | Token Pruning for Caching Better: 9 Times Acceleration on Stable Diffusion for Free | Evelyn Zhang et.al. | 2501.00375 | link |
2024-12-31 | SG-Splatting: Accelerating 3D Gaussian Splatting with Spherical Gaussians | Yiwen Wang et.al. | 2501.00342 | null |
2024-12-31 | Improving image quality of the Solar Disk Imager (SDI) of the Lyman-alpha Solar Telescope (LST) onboard the ASO-S mission | Hui Liu et.al. | 2501.00231 | null |
2024-12-30 | What Makes for a Good Stereoscopic Image? | Netanel Y. Tamir et.al. | 2412.21127 | null |
2024-12-30 | VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation | Jiazheng Xu et.al. | 2412.21059 | link |
2024-12-30 | DDIM sampling for Generative AIBIM, a faster intelligent structural design framework | Zhili He et.al. | 2412.20899 | null |
2024-12-30 | Acquisition-Independent Deep Learning for Quantitative MRI Parameter Estimation using Neural Controlled Differential Equations | Daan Kuppens et.al. | 2412.20844 | null |
2024-12-30 | 4D Gaussian Splatting: Modeling Dynamic Scenes with Native 4D Primitives | Zeyu Yang et.al. | 2412.20720 | null |
2024-12-29 | Single-image reflection removal via self-supervised diffusion models | Zhengyang Lu et.al. | 2412.20466 | null |
2024-12-29 | ESVQA: Perceptual Quality Assessment of Egocentric Spatial Videos | Xilei Zhu et.al. | 2412.20423 | null |
2024-12-29 | Bringing Objects to Life: 4D generation from 3D objects | Ohad Rahamim et.al. | 2412.20422 | null |
2024-12-28 | An Ordinary Differential Equation Sampler with Stochastic Start for Diffusion Bridge Models | Yuang Wang et.al. | 2412.19992 | null |
2024-12-27 | Structural Similarity in Deep Features: Image Quality Assessment Robust to Geometrically Disparate Reference | Keke Zhang et.al. | 2412.19553 | null |
2024-12-30 | DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT | Xiaotao Hu et.al. | 2412.19505 | link |
2024-12-27 | RAIN: Real-time Animation of Infinite Video Stream | Zhilei Shu et.al. | 2412.19489 | null |
2024-12-27 | Generative Adversarial Network on Motion-Blur Image Restoration | Zhengdong Li et.al. | 2412.19479 | null |
2024-12-27 | Adrenaline: Adaptive Rendering Optimization System for Scalable Cloud Gaming | Jin Heo et.al. | 2412.19446 | null |
2024-12-27 | The Hobby-Eberly Telescope Dark Energy Experiment Survey (HETDEX) Active Galactic Nuclei Catalog: the Fourth Data Release | Chenxu Liu et.al. | 2412.19414 | null |
2024-12-26 | Reflective Gaussian Splatting | Yuxuan Yao et.al. | 2412.19282 | null |
2024-12-26 | FineVQ: Fine-Grained User Generated Content Video Quality Assessment | Huiyu Duan et.al. | 2412.19238 | null |
2024-12-26 | FACEMUG: A Multimodal Generative and Fusion Framework for Local Facial Editing | Wanglong Lu et.al. | 2412.19009 | null |
2024-12-25 | TINQ: Temporal Inconsistency Guided Blind Video Quality Assessment | Yixiao Li et.al. | 2412.18933 | link |
2024-12-25 | ArtNVG: Content-Style Separated Artistic Neighboring-View Gaussian Stylization | Zixiao Gu et.al. | [2412.18783](http |