forked from Vincentqyw/cv-arxiv-daily

🎓 Automatically update CV papers daily using GitHub Actions (updates every 12 hours)

SKDDJ/cv-arxiv-daily

 
 


Updated on 2024.12.21

Table of Contents
  1. PEFT
  2. Text-to-Image Generation
  3. Vision-Language Models
  4. Generative Weight Space Modeling
  5. Data Distillation
  6. Schrodinger Bridge
  7. Dataset Distillation
  8. Synthetic Data Generation

PEFT

Publish Date Title Authors PDF Code
2024-12-19 FedPIA -- Permuting and Integrating Adapters leveraging Wasserstein Barycenters for Finetuning Foundation Models in Multi-Modal Federated Learning Pramit Saha et.al. 2412.14424 null
2024-12-18 Parameter-efficient Fine-tuning for improved Convolutional Baseline for Brain Tumor Segmentation in Sub-Saharan Africa Adult Glioma Dataset Bijay Adhikari et.al. 2412.14100 null
2024-12-18 A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Method-Level Code Smell Detection Beiqi Zhang et.al. 2412.13801 null
2024-12-18 Refining Salience-Aware Sparse Fine-Tuning Strategies for Language Models Xinxin Liu et.al. 2412.13488 null
2024-12-17 Train More Parameters But Mind Their Placement: Insights into Language Adaptation with PEFT Jenny Kunz et.al. 2412.12674 link
2024-12-16 Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering Jinhe Bi et.al. 2412.12359 link
2024-12-16 A LoRA is Worth a Thousand Pictures Chenxi Liu et.al. 2412.12048 null
2024-12-11 Adaptive Principal Components Allocation with the $\ell_{2,g}$-regularized Gaussian Graphical Model for Efficient Fine-Tuning Large Models Jingjing Zheng et.al. 2412.08592 link
2024-12-10 PETALface: Parameter Efficient Transfer Learning for Low-resolution Face Recognition Kartik Narayan et.al. 2412.07771 null
2024-12-10 MoDULA: Mixture of Domain-Specific and Universal LoRA for Multi-Task Learning Yufei Ma et.al. 2412.07405 null
2024-12-13 Crack-EdgeSAM Self-Prompting Crack Segmentation System for Edge Devices Yingchu Wang et.al. 2412.07205 null
2024-12-08 Taming Sensitive Weights: Noise Perturbation Fine-tuning for Robust LLM Quantization Dongwei Wang et.al. 2412.06858 null
2024-12-09 BoRA: Bi-dimensional Weight-Decomposed Low-Rank Adaptation Qiushi Wang et.al. 2412.06441 null
2024-12-19 S$^{2}$FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity Xinyu Yang et.al. 2412.06289 null
2024-12-08 KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models Fan Wang et.al. 2412.06071 link
2024-12-07 Training-Free Bayesianization for Low-Rank Adapters of Large Language Models Haizhou Shi et.al. 2412.05723 link
2024-12-06 PETapter: Leveraging PET-style classification heads for modular few-shot parameter-efficient fine-tuning Jonas Rieger et.al. 2412.04975 null
2024-12-04 Prompting Large Language Models for Clinical Temporal Relation Extraction Jianping He et.al. 2412.04512 null
2024-12-05 SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning Seokju Yun et.al. 2412.04077 link
2024-12-04 Improving Linguistic Diversity of Large Language Models with Possibility Exploration Fine-Tuning Long Mai et.al. 2412.03343 link
2024-12-03 Mixture of Physical Priors Adapter for Parameter-Efficient Fine-Tuning Zhaozhi Wang et.al. 2412.02759 null
2024-12-03 CPP-UT-Bench: Can LLMs Write Complex Unit Tests in C++? Vaishnavi Bhargava et.al. 2412.02735 null
2024-12-03 LoRA Diffusion: Zero-Shot LoRA Synthesis for Diffusion Model Personalization Ethan Smith et.al. 2412.02352 null
2024-12-03 A Comprehensive Evaluation of Large Language Models on Aspect-Based Sentiment Analysis Changzhi Zhou et.al. 2412.02279 null
2024-11-30 Unified Parameter-Efficient Unlearning for LLMs Chenlu Ding et.al. 2412.00383 null
2024-11-29 SURE-VQA: Systematic Understanding of Robustness Evaluation in Medical VQA Tasks Kim-Celine Kahl et.al. 2411.19688 link
2024-11-28 Parameter-Efficient Transfer Learning for Music Foundation Models Yiwei Ding et.al. 2411.19371 link
2024-11-28 PEFT-as-an-Attack! Jailbreaking Language Models during Federated Parameter-Efficient Fine-Tuning Shenghui Li et.al. 2411.19335 null
2024-11-28 Enhancing Parameter-Efficient Fine-Tuning of Vision Transformers through Frequency-Based Adaptation Son Thai Ly et.al. 2411.19297 link
2024-11-27 Challenges in Adapting Multilingual LLMs to Low-Resource Languages using LoRA PEFT Tuning Omkar Khade et.al. 2411.18571 null
2024-11-26 PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning Zhen Sun et.al. 2411.17453 null
2024-11-29 Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning Hui-Yue Yang et.al. 2411.17217 null
2024-11-25 Towards Efficient Model-Heterogeneity Federated Learning for Large Models Ruofan Jia et.al. 2411.16796 null
2024-11-25 Parameter Efficient Instruction Tuning: An Empirical Study Pengfei He et.al. 2411.16775 null
2024-11-25 Graph Adapter of EEG Foundation Models for Parameter Efficient Fine Tuning Toyotaro Suzumura et.al. 2411.16155 null
2024-11-24 Efficient and Private: Memorisation under differentially private parameter-efficient fine-tuning in language models Olivia Ma et.al. 2411.15831 null
2024-11-21 Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation Seokil Ham et.al. 2411.15224 null
2024-11-22 LoRA-FAIR: Federated LoRA Fine-Tuning with Aggregation and Initialization Refinement Jieming Bian et.al. 2411.14961 null
2024-11-21 Multi LoRA Meets Vision: Merging multiple adapters to create a multi task model Ege Kesim et.al. 2411.14064 null
2024-11-17 F$^3$OCUS -- Federated Finetuning of Vision-Language Foundation Models with Optimal Client Layer Updating Strategy via Multi-objective Meta-Heuristics Pramit Saha et.al. 2411.11912 null
2024-11-16 HELENE: Hessian Layer-wise Clipping and Gradient Annealing for Accelerating Fine-tuning LLM with Zeroth-order Optimization Huaqin Zhao et.al. 2411.10696 null
2024-11-12 PERFT: Parameter-Efficient Routed Fine-Tuning for Mixture-of-Expert Model Yilun Liu et.al. 2411.08212 null
2024-11-10 Prompt-Efficient Fine-Tuning for GPT-like Deep Models to Reduce Hallucination and to Improve Reproducibility in Scientific Text Generation Using Stochastic Optimisation Techniques Daniil Sulimov et.al. 2411.06445 null
2024-11-06 MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba Masakazu Yoshimura et.al. 2411.03855 null
2024-11-04 PipeLLM: Fast and Confidential Large Language Model Services with Speculative Pipelined Encryption Yifan Tan et.al. 2411.03357 null
2024-11-05 Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation Junchen Fu et.al. 2411.02992 null
2024-11-04 Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study André Storhaug et.al. 2411.02462 null
2024-11-04 Expanding Sparse Tuning for Low Memory Usage Shufan Shen et.al. 2411.01800 link
2024-11-15 Visual Fourier Prompt Tuning Runjia Zeng et.al. 2411.01327 link
2024-10-31 CleaR: Towards Robust and Generalized Parameter-Efficient Fine-Tuning for Noisy Label Learning Yeachan Kim et.al. 2411.00873 null
2024-10-30 FPE-LLM: Highly Intelligent Time-Series Forecasting and Language Interaction LLM in Energy Systems Zihang Qiu et.al. 2411.00852 null
2024-11-01 Dual Low-Rank Adaptation for Continual Learning with Pre-Trained Models Huancheng Chen et.al. 2411.00623 null
2024-11-01 Is Multiple Object Tracking a Matter of Specialization? Gianluca Mancusi et.al. 2411.00553 null
2024-11-01 C2A: Client-Customized Adaptation for Parameter-Efficient Federated Learning Yeachan Kim et.al. 2411.00311 link
2024-10-29 Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models Donghoon Kim et.al. 2411.00029 null
2024-10-30 Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation Wei Dong et.al. 2410.22952 null
2024-10-30 MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning Xujia Wang et.al. 2410.22782 null
2024-10-29 Meta-Learning Adaptable Foundation Models Jacob L. Block et.al. 2410.22264 null
2024-10-29 Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models Raman Dutt et.al. 2410.22149 link
2024-10-30 IntLoRA: Integral Low-rank Adaptation of Quantized Diffusion Models Hang Guo et.al. 2410.21759 link
2024-10-28 KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation Rambod Azimi et.al. 2410.20777 link
2024-10-27 Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation Maohao Shen et.al. 2410.20336 null
2024-11-01 Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies Luping Wang et.al. 2410.19878 null
2024-10-23 MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning Jingfan Zhang et.al. 2410.18035 null
2024-10-22 Towards Real Zero-Shot Camouflaged Object Segmentation without Camouflaged Annotations Cheng Lei et.al. 2410.16953 null
2024-10-22 MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report Samrajya Thapa et.al. 2410.16239 link
2024-10-21 Natural GaLore: Accelerating GaLore for memory-efficient LLM Training and Fine-tuning Arijit Das et.al. 2410.16029 link
2024-10-18 Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation Shuai Zhao et.al. 2410.14425 link
2024-10-17 LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning Yiming Shi et.al. 2410.13618 link
2024-10-16 Communication-Efficient and Tensorized Federated Fine-Tuning of Large Language Models Sajjad Ghiasvand et.al. 2410.13097 null
2024-10-17 Prompt Compression for Large Language Models: A Survey Zongqian Li et.al. 2410.12388 link
2024-10-15 Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models Kai Yao et.al. 2410.11772 link
2024-10-15 LoKO: Low-Rank Kalman Optimizer for Online Fine-Tuning of Large Models Hossein Abdi et.al. 2410.11551 null
2024-10-15 RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates Md Kowsher et.al. 2410.10075 link
2024-10-13 BiDoRA: Bi-level Optimization-Based Weight-Decomposed Low-Rank Adaptation Peijia Qin et.al. 2410.09758 null
2024-10-12 Towards Efficient Visual-Language Alignment of the Q-Former for Visual Reasoning Tasks Sungkyung Kim et.al. 2410.09489 link
2024-10-15 MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning Yaming Yang et.al. 2410.09437 null
2024-10-09 Parameter-Efficient Fine-Tuning via Selective Discrete Cosine Transform Yixian Shen et.al. 2410.09103 null
2024-10-04 BIPEFT: Budget-Guided Iterative Search for Parameter Efficient Fine-Tuning of Large Pretrained Language Models Aofei Chang et.al. 2410.09079 null
2024-10-11 Parameter-Efficient Fine-Tuning of State Space Models Kevin Galim et.al. 2410.09016 link
2024-10-10 Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud Learning Dingkang Liang et.al. 2410.08114 link
2024-10-10 SLIM: Let LLM Learn More and Forget Less with Soft LoRA and Identity Mixture Jiayi Han et.al. 2410.07739 null
2024-10-10 Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures Yiming Chen et.al. 2410.07698 link
2024-10-09 SparseGrad: A Selective Method for Efficient Fine-tuning of MLP Layers Viktoriia Chekalina et.al. 2410.07383 link
2024-10-09 Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs Ruijia Niu et.al. 2410.06431 null
2024-10-08 Are Large Language Models State-of-the-art Quality Estimators for Machine Translation of User-generated Content? Shenbin Qian et.al. 2410.06338 link
2024-10-15 LoRTA: Low Rank Tensor Adaptation of Large Language Models Ignacio Hounie et.al. 2410.04060 null
2024-10-03 Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection Tianxiang Chen et.al. 2410.02330 link
2024-10-02 TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning Large Language Models Zefang Liu et.al. 2410.02062 link
2024-10-02 NEAT: Nonlinear Parameter-efficient Adaptation of Pre-trained Models Yibo Zhong et.al. 2410.01870 null
2024-09-27 A GEN AI Framework for Medical Note Generation Hui Yi Leong et.al. 2410.01841 null
2024-10-02 DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models Yuxuan Zhang et.al. 2410.01497 link
2024-10-01 PrivTuner with Homomorphic Encryption and LoRA: A P3EFT Scheme for Privacy-Preserving Parameter-Efficient Fine-Tuning of AI Foundation Models Yang Li et.al. 2410.00433 null
2024-09-30 Adapting LLMs for the Medical Domain in Portuguese: A Study on Fine-Tuning and Model Evaluation Pedro Henrique Paiola et.al. 2410.00163 null
2024-09-30 Resource Allocation for Stable LLM Training in Mobile Edge Computing Chang Liu et.al. 2409.20247 null
2024-09-30 Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models Luohe Shi et.al. 2409.20181 null
2024-09-28 FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models Yucheng Xie et.al. 2409.19289 null
2024-10-01 Backdoor Attacks for LLMs with Weak-To-Strong Knowledge Distillation Shuai Zhao et.al. 2409.17946 null
2024-09-26 PEDRO: Parameter-Efficient Fine-tuning with Prompt DEpenDent Representation MOdification Tianfang Xie et.al. 2409.17834 null
2024-09-30 Efficient In-Domain Question Answering for Resource-Constrained Environments Isaac Chung et.al. 2409.17648 null
2024-10-07 PACE: marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization Yao Ni et.al. 2409.17137 link
2024-09-25 Parameter-efficient Bayesian Neural Networks for Uncertainty-aware Depth Estimation Richard D. Paul et.al. 2409.17085 null
2024-10-02 Bone: Block Affine Transformation as Parameter Efficient Fine-tuning Methods for Large Language Models Jiale Kang et.al. 2409.15371 link
2024-09-22 Flat-LoRA: Low-Rank Adaption over a Flat Loss Landscape Tao Li et.al. 2409.14396 null
2024-10-01 Obliviate: Neutralizing Task-agnostic Backdoors within the Parameter-efficient Fine-tuning Paradigm Jaehan Kim et.al. 2409.14119 link
2024-09-20 HUT: A More Computation Efficient Fine-Tuning Method With Hadamard Updated Transformation Geyuan Zhang et.al. 2409.13501 null
2024-09-17 THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models Mengfei Liang et.al. 2409.11353 link
2024-09-17 LPT++: Efficient Training on Mixture of Long-tailed Experts Bowen Dong et.al. 2409.11323 null
2024-09-17 Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models Divij Gupta et.al. 2409.11302 null
2024-09-18 Propulsion: Steering LLM with Tiny Fine-Tuning Md Kowsher et.al. 2409.10927 link
2024-09-16 From Text to Emoji: How PEFT-Driven Personality Manipulation Unleashes the Emoji Potential in LLMs Navya Jain et.al. 2409.10245 null
2024-09-14 COMFORT: A Continual Fine-Tuning Framework for Foundation Models Targeted at Consumer Healthcare Chia-Hao Li et.al. 2409.09549 null
2024-09-14 Comparing Retrieval-Augmentation and Parameter-Efficient Fine-Tuning for Privacy-Preserving Personalization of Large Language Models Alireza Salemi et.al. 2409.09510 link
2024-09-13 Risks When Sharing LoRA Fine-Tuned Diffusion Model Weights Dixi Yao et.al. 2409.08482 null
2024-09-12 Do Vision Foundation Models Enhance Domain Generalization in Medical Image Segmentation? Kerem Cekmeceli et.al. 2409.07960 link
2024-09-11 Efficient Localized Adaptation of Neural Weather Forecasting: A Case Study in the MENA Region Muhammad Akhtar Munir et.al. 2409.07585 link
2024-09-10 Sam2Rad: A Segmentation Model for Medical Images with Learnable Prompts Assefa Seyoum Wahd et.al. 2409.06821 link
2024-09-11 Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models Yao Shu et.al. 2409.06277 link
2024-09-09 SVFit: Parameter-Efficient Fine-Tuning of Large Pre-Trained Models Using Singular Values Chengwei Sun et.al. 2409.05926 null
2024-09-10 Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment Zhixian Zhao et.al. 2409.05015 null
2024-09-06 Customizing Large Language Model Generation Style using Parameter-Efficient Finetuning Xinyue Liu et.al. 2409.04574 null
2024-09-04 iConFormer: Dynamic Parameter-Efficient Tuning with Input-Conditioned Adaptation Hayeon Jo et.al. 2409.02838 null
2024-09-04 Deconfounded Causality-aware Parameter-Efficient Fine-Tuning for Problem-Solving Improvement of LLMs Ruoyu Wang et.al. 2409.02686 null
2024-09-04 Robust Federated Finetuning of Foundation Models via Alternating Minimization of LoRA Shuangyi Chen et.al. 2409.02346 null
2024-09-02 Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning Chongjie Si et.al. 2409.01035 link
2024-08-28 3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability Baohao Liao et.al. 2409.00119 link
2024-08-21 SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models Yang Cao et.al. 2409.00055 link
2024-08-30 MoRe Fine-Tuning with 10x Fewer Parameters Wenxuan Tan et.al. 2408.17383 link
2024-09-02 Instant Adversarial Purification with Adversarial Consistency Distillation Chun Tong Lei et.al. 2408.17064 null
2024-08-28 Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization Léo Hemamou et.al. 2408.15801 null
2024-08-27 GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs Maxim Zhelnin et.al. 2408.15300 link
2024-08-27 Pre-training Everywhere: Parameter-Efficient Fine-Tuning for Medical Image Analysis via Target Parameter Pre-training Xingliang Lei et.al. 2408.15011 null
2024-08-27 CVPT: Cross-Attention help Visual Prompt Tuning adapt visual task Lingyun Huang et.al. 2408.14961 link
2024-08-27 Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models Aradhye Agarwal et.al. 2408.14470 link
2024-08-24 Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings Sagar Srinivas Sakhinana et.al. 2408.13622 null
2024-08-21 Positional Prompt Tuning for Efficient 3D Representation Learning Shaochen Zhang et.al. 2408.11567 link
2024-08-20 Pluto and Charon: A Time and Memory Efficient Collaborative Edge AI Framework for Personal LLMs Fine-Tuning Bei Ouyang et.al. 2408.10746 null
2024-08-20 TDS-CLIP: Temporal Difference Side Network for Image-to-Video Transfer Learning Bin Wang et.al. 2408.10688 link
2024-08-19 TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition Tianwei Lin et.al. 2408.09856 link
2024-08-16 Learning to Route for Dynamic Adapter Composition in Continual Learning with Language Models Vladimir Araujo et.al. 2408.09053 null
2024-08-14 KIND: Knowledge Integration and Diversion in Diffusion Models Yucheng Xie et.al. 2408.07337 null
2024-08-30 TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning Yujie Feng et.al. 2408.05200 link
2024-08-08 Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models Yupeng Chang et.al. 2408.04556 link
2024-08-06 SARA: Singular-Value Based Adaptive Low-Rank Adaption Jihao Gu et.al. 2408.03290 null
2024-08-06 Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi Pranita Deshmukh et.al. 2408.03172 null
2024-08-03 TS-SAM: Fine-Tuning Segment-Anything Model for Downstream Tasks Yang Yu et.al. 2408.01835 link
2024-08-02 MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts Lin Ning et.al. 2408.01505 null
2024-08-02 Tensor Train Low-rank Approximation (TT-LoRA): Democratizing AI with Accelerated LLMs Afia Anjum et.al. 2408.01008 null
2024-07-31 A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation Mothilal Asokan et.al. 2407.21739 null
2024-07-28 Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models Jifeng Wang et.al. 2407.19564 link
2024-07-24 Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective Jingren Liu et.al. 2407.17120 null
2024-07-22 Zero-Shot Embeddings Inform Learning and Forgetting with Vision-Language Encoders Laura Niss et.al. 2407.15731 null
2024-07-21 Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization Jiajun Hu et.al. 2407.15085 null
2024-07-16 InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification Yujia Hu et.al. 2407.12882 link
2024-07-18 Turning Generative Models Degenerate: The Power of Data Poisoning Attacks Shuli Jiang et.al. 2407.12281 null
2024-07-16 Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification Naif Alkhunaizi et.al. 2407.11573 null
2024-07-16 An efficient framework based on large foundation model for cervical cytopathology whole slide image screening Jialong Huang et.al. 2407.11486 link
2024-07-10 RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization Xijie Huang et.al. 2407.08044 link
2024-07-10 ROSA: Random Subspace Adaptation for Efficient Fine-Tuning Marawan Gamal Abdel Hameed et.al. 2407.07802 link
2024-07-10 Parameter Efficient Fine Tuning for Multi-scanner PET to PET Reconstruction Yumin Kim et.al. 2407.07517 null
2024-07-09 Reprogramming Distillation for Medical Foundation Models Yuhang Zhou et.al. 2407.06504 null
2024-07-07 See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition Chongjie Si et.al. 2407.05417 link
2024-07-16 LoRA-GA: Low-Rank Adaptation with Gradient Approximation Shaowen Wang et.al. 2407.05000 link
2024-07-05 GPT vs RETRO: Exploring the Intersection of Retrieval and Parameter-Efficient Fine-Tuning Aleksander Ficek et.al. 2407.04528 null
2024-07-04 Deep Content Understanding Toward Entity and Aspect Target Sentiment Analysis on Foundation Models Vorakit Vorakitphan et.al. 2407.04050 link
2024-07-04 ASteISR: Adapting Single Image Super-resolution Pre-trained Model for Efficient Stereo Image Super-resolution Yuanbo Zhou et.al. 2407.03598 null
2024-07-03 Knowledge Composition using Task Vectors with Learned Anisotropic Scaling Frederic Z. Zhang et.al. 2407.02880 link
2024-07-03 Exploring the Capabilities of LLMs for Code Change Related Tasks Lishui Fan et.al. 2407.02824 link
2024-07-02 FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs Haodong Chen et.al. 2407.02157 null
2024-07-02 CatMemo at the FinLLM Challenge Task: Fine-Tuning Large Language Models using Data Fusion in Financial Applications Yupeng Cao et.al. 2407.01953 null
2024-07-05 Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models Zihan Wang et.al. 2407.01906 link
2024-07-01 A Fingerprint for Large Language Models Zhiguang Yang et.al. 2407.01235 null
2024-07-02 Embedded Prompt Tuning: Towards Enhanced Calibration of Pretrained Models for Medical Images Wenqiang Zu et.al. 2407.01003 link
2024-06-25 Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning Arijit Sehanobish et.al. 2406.17740 null
2024-06-19 Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks Liangxin Qian et.al. 2406.13602 null
2024-06-19 Sparse High Rank Adapters Kartikeya Bhardwaj et.al. 2406.13175 null
2024-06-18 Bayesian-LoRA: LoRA based Parameter Efficient Fine-Tuning using Optimal Quantization levels and Rank Values trough Differentiable Bayesian Gates Cristian Meo et.al. 2406.13046 null
2024-06-18 Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation Branislav Pecher et.al. 2406.12471 link
2024-06-17 A Semantic-based Layer Freezing Approach to Efficient Fine-Tuning of Language Models Jian Gu et.al. 2406.11753 null
2024-06-16 ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts Samar Khanna et.al. 2406.10973 null
2024-06-16 ShareLoRA: Parameter Efficient and Robust Large Language Model Fine-tuning via Shared Low-Rank Adaptation Yurun Song et.al. 2406.10785 null
2024-06-16 RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning Haoyu Wang et.al. 2406.10777 null
2024-06-15 Benchmarking Children's ASR with Supervised and Self-supervised Speech Foundation Models Ruchao Fan et.al. 2406.10507 link
2024-06-15 Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts Zhaoxuan Tan et.al. 2406.10471 link
2024-06-13 Reflecting on the State of Rehearsal-free Continual Learning with Pretrained Models Lukas Thede et.al. 2406.09384 null
2024-06-12 Exploring Fact Memorization and Style Imitation in LLMs Using QLoRA: An Experimental Study and Quality Assessment Methods Eugene Vyborov et.al. 2406.08582 null
2024-06-12 The Impact of Initialization on LoRA Finetuning Dynamics Soufiane Hayou et.al. 2406.08447 null
2024-06-20 Low-Rank Quantization-Aware Training for LLMs Yelysei Bondarenko et.al. 2406.06385 link
2024-06-10 A Parameter-efficient Language Extension Framework for Multilingual ASR Wei Liu et.al. 2406.06329 null
2024-06-09 A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Automated Program Repair Guochang Li et.al. 2406.05639 link
2024-06-07 Efficient Differentially Private Fine-Tuning of Diffusion Models Jing Liu et.al. 2406.05257 null
2024-06-07 CorDA: Context-Oriented Decomposition Adaptation of Large Language Models Yibo Yang et.al. 2406.05223 link
2024-06-07 An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models Xiongtao Zhou et.al. 2406.05130 link
2024-06-07 MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter Jitai Hao et.al. 2406.04984 link
2024-06-06 Time Sensitive Knowledge Editing through Efficient Finetuning Xiou Ge et.al. 2406.04496 link
2024-06-06 VHDL-Eval: A Framework for Evaluating Large Language Models in VHDL Code Generation Prashanth Vijayaraghavan et.al. 2406.04379 null
2024-06-10 Hypernetworks for Personalizing ASR to Atypical Speech Max Müller-Eberstein et.al. 2406.04240 null
2024-06-06 Light-PEFT: Lightening Parameter-Efficient Fine-Tuning via Early Pruning Naibin Gu et.al. 2406.03792 link
2024-06-05 Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need Martin Wistuba et.al. 2406.03216 null
2024-06-06 Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision Minglei Li et.al. 2406.03051 null
2024-05-31 Mamba State-Space Models Can Be Strong Downstream Learners John T. Halloran et.al. 2406.00209 null
2024-05-30 ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections Massimo Bini et.al. 2405.20271 link
2024-05-30 SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors Vijay Lingam et.al. 2405.19597 link
2024-05-29 MemControl: Mitigating Memorization in Medical Diffusion Models via Automated Parameter Selection Raman Dutt et.al. 2405.19458 link
2024-05-29 MLAE: Masked LoRA Experts for Parameter-Efficient Fine-Tuning Junjie Wang et.al. 2405.18897 link
2024-05-29 Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation Zelin Peng et.al. 2405.18840 null
2024-06-01 Low-Rank Few-Shot Adaptation of Vision-Language Models Maxime Zanella et.al. 2405.18541 null
2024-05-28 Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning Renzhi Wang et.al. 2405.18292 null
2024-05-28 VeLoRA: Memory Efficient Training using Rank-1 Sub-Token Projections Roy Miles et.al. 2405.17991 link
2024-05-28 Sparsity- and Hybridity-Inspired Visual Parameter-Efficient Fine-Tuning for Medical Diagnosis Mingyuan Liu et.al. 2405.17877 null
2024-05-27 LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters Klaudia Bałazy et.al. 2405.17604 link
2024-05-23 EMR-Merging: Tuning-Free High-Performance Model Merging Chenyu Huang et.al. 2405.17461 link
2024-05-28 DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution Yulong Mao et.al. 2405.17357 link
2024-05-27 $\textit{Trans-LoRA}$: towards data-free Transferable Parameter Efficient Finetuning Runqian Wang et.al. 2405.17258 null
2024-05-30 Sparse Matrix in Large Language Model Fine-tuning Haoze He et.al. 2405.15525 null
2024-05-24 Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation Abhinav Jain et.al. 2405.15282 link
2024-05-27 VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks Yang Li et.al. 2405.15179 link
2024-05-23 Bitune: Bidirectional Instruction-Tuning Dawid J. Kopiczko et.al. 2405.14862 null
2024-05-23 Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference Ting Liu et.al. 2405.14700 link
2024-05-22 Spectral Adapter: Fine-Tuning in Spectral Space Fangzhao Zhang et.al. 2405.13952 link
2024-05-24 MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models Jingwei Xu et.al. 2405.13053 link
2024-05-20 FeTT: Continual Class Incremental Learning via Feature Transformation Tuning Sunyuan Qiang et.al. 2405.11822 null
2024-05-21 HARIS: Human-Like Attention for Reference Image Segmentation Mengxi Zhang et.al. 2405.10707 null
2024-05-28 DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation Jie Xu et.al. 2405.06368 null
2024-05-09 Selective Fine-tuning on LLM-labeled Data May Reduce Reliance on Human Annotation: A Case Study Using Schedule-of-Event Table Detection Bhawesh Kumar et.al. 2405.06093 null
2024-05-09 Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning Shibo Jie et.al. 2405.05615 link
2024-05-07 Refining Joint Text and Source Code Embeddings for Retrieval Task with Parameter-Efficient Fine-Tuning Karim Galliamov et.al. 2405.04126 link
2024-05-04 Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning Jing Xu et.al. 2405.02596 link
2024-03-16 Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R Amirreza Esmaeili et.al. 2405.01553 null
2024-05-02 NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment Gerald Shen et.al. 2405.01481 link
2024-04-29 LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Justin Zhao et.al. 2405.00732 link
2024-05-01 Investigating Automatic Scoring and Feedback using Large Language Models Gloria Ashiya Katuka et.al. 2405.00602 null
2024-05-01 MoPEFT: A Mixture-of-PEFTs for the Segment Anything Model Rajat Sahay et.al. 2405.00293 null
2024-04-30 SPAFIT: Stratified Progressive Adaptation Fine-tuning for Pre-trained Large Language Models Samir Arora et.al. 2405.00201 null
2024-05-23 HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning Chunlin Tian et.al. 2404.19245 link
2024-05-25 FeDeRA: Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition Yuxuan Yan et.al. 2404.18848 null
2024-04-25 Efficiency in Focus: LayerNorm as a Catalyst for Fine-tuning Medical Visual Language Pre-trained Models Jiawei Chen et.al. 2404.16385 null
2024-05-23 MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts Dengchun Li et.al. 2404.15159 link
2024-04-22 ColA: Collaborative Adaptation with Gradient Learning Enmao Diao et.al. 2404.13844 link
2024-04-23 Parameter Efficient Fine Tuning: A Comprehensive Analysis Across Applications Charith Chandra Sai Balne et.al. 2404.13506 null
2024-04-18 SKIP: Skill-Localized Prompt Tuning for Inference Speed Boost-Up Nakyeong Yang et.al. 2404.11916 null
2024-04-16 Shears: Unstructured Sparsity with Neural Low-rank Adapter Search J. Pablo Muñoz et.al. 2404.10934 link
2024-04-16 Exact and Efficient Unlearning for Large Language Model-based Recommendation Zhiyu Hu et.al. 2404.10327 null
2024-04-15 LoRA Dropout as a Sparsity Regularizer for Overfitting Control Yang Lin et.al. 2404.09610 null
2024-04-21 Analyzing the Impact of Data Selection and Fine-Tuning on Economic and Political Biases in LLMs Ahmed Agiza et.al. 2404.08699 link
2024-04-08 Certified PEFTSmoothing: Parameter-Efficient Fine-Tuning with Randomized Smoothing Chengyan Fu et.al. 2404.05350 null
2024-04-08 DLoRA: Distributed Parameter-Efficient Fine-Tuning Solution for Large Language Model Chao Gao et.al. 2404.05182 null
2024-04-12 Q-PEFT: Query-dependent Parameter Efficient Fine-tuning for Text Reranking with Large Language Models Zhiyuan Peng et.al. 2404.04522 null
2024-04-05 Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation Tong Su et.al. 2404.04212 null
2024-05-22 ReFT: Representation Finetuning for Language Models Zhengxuan Wu et.al. 2404.03592 link
2024-06-11 Personalized LLM Response Generation with Parameterized Memory Injection Kai Zhang et.al. 2404.03565 null
2024-06-20 Eigenpruning: an Interpretability-Inspired PEFT Method Tomás Vergara-Browne et.al. 2404.03147 link
2024-05-28 PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models Fanxu Meng et.al. 2404.02948 link
2024-04-03 Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data Parth Patwa et.al. 2404.02422 null
2024-04-11 IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT Junchen Fu et.al. 2404.02059 link
2024-03-31 Query-driven Relevant Paragraph Extraction from Legal Judgments T. Y. S. S Santosh et.al. 2404.00595 null
2024-03-30 Edinburgh Clinical NLP at SemEval-2024 Task 2: Fine-tune your model unless you have access to GPT-4 Aryo Pradipta Gema et.al. 2404.00484 link
2024-04-03 InfLoRA: Interference-Free Low-Rank Adaptation for Continual Learning Yan-Shuo Liang et.al. 2404.00228 link
2024-03-27 Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation Mateusz Klimaszewski et.al. 2403.18804 link
2024-03-26 The Unreasonable Ineffectiveness of the Deeper Layers Andrey Gromov et.al. 2403.17887 null
2024-04-15 ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models Zequan Liu et.al. 2403.16187 null
2024-03-22 KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation Xindi Luo et.al. 2403.14950 link
2024-03-22 A Single Linear Layer Yields Task-Adapted Low-Rank Matrices Hwichan Kim et.al. 2403.14946 null
2024-03-21 AutoRE: Document-Level Relation Extraction with Large Language Models Xue Lilong et.al. 2403.14888 link
2024-04-29 Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey Zeyu Han et.al. 2403.14608 null
2024-03-20 Harnessing Large Language Models for Text-Rich Sequential Recommendation Zhi Zheng et.al. 2403.13325 link
2024-04-16 AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models Zeyu Liu et.al. 2403.13269 null
2024-03-18 Improving LoRA in Privacy-preserving Federated Learning Youbang Sun et.al. 2403.12313 null
2024-03-18 Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation Wangbo Zhao et.al. 2403.11808 link
2024-03-18 Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model Haoyun Xu et.al. 2403.11621 null
2024-03-19 JORA: JAX Tensor-Parallel LoRA Library for Retrieval Augmented Fine-Tuning Anique Tahir et.al. 2403.11366 link
2024-03-14 Introducing Routing Functions to Vision-Language Parameter-Efficient Fine-Tuning with Low-Rank Bottlenecks Tingyu Qu et.al. 2403.09377 link
2024-03-14 PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation Yizhe Xiong et.al. 2403.09192 link
2024-03-13 Data-oriented Dynamic Fine-tuning Parameter Selection Strategy for FISH Mask based Efficient Fine-tuning Ming Dong et.al. 2403.08484 null

(back to top)

Text-to-Image Generation

Publish Date Title Authors PDF Code
2024-12-19 LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis Hanlin Wang et.al. 2412.15214 null
2024-12-19 Flowing from Words to Pixels: A Framework for Cross-Modality Evolution Qihao Liu et.al. 2412.15213 null
2024-12-19 Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation Hadi Alzayer et.al. 2412.15211 null
2024-12-19 AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation Moayed Haji-Ali et.al. 2412.15191 null
2024-12-19 LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation Weijia Shi et.al. 2412.15188 null
2024-12-19 Tiled Diffusion Or Madar et.al. 2412.15185 null
2024-12-19 SqueezeMe: Efficient Gaussian Avatars for VR Shunsuke Saito et.al. 2412.15171 null
2024-12-19 OnlineVPO: Align Video Diffusion Model with Online Video-Centric Preference Optimization Jiacheng Zhang et.al. 2412.15159 null
2024-12-19 Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM Yatai Ji et.al. 2412.15156 link
2024-12-19 Jet: A Modern Transformer-Based Normalizing Flow Alexander Kolesnikov et.al. 2412.15129 null
2024-12-19 Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation Yang Tian et.al. 2412.15109 null
2024-12-19 Learning Disentangled Equivariant Representation for Explicitly Controllable 3D Molecule Generation Haoran Liu et.al. 2412.15086 null
2024-12-19 Eigenstate Preparation on Quantum Computers Joey Bonitati et.al. 2412.15081 null
2024-12-19 Uni-Renderer: Unifying Rendering and Inverse Rendering Via Dual Stream Diffusion Zhifei Chen et.al. 2412.15050 null
2024-12-19 DCTdiff: Intriguing Properties of Image Generative Modeling in the DCT Space Mang Ning et.al. 2412.15032 link
2024-12-18 AniDoc: Animation Creation Made Easier Yihao Meng et.al. 2412.14173 null
2024-12-19 E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling Zhihang Yuan et.al. 2412.14170 null
2024-12-18 Autoregressive Video Generation without Vector Quantization Haoge Deng et.al. 2412.14169 link
2024-12-18 VideoDPO: Omni-Preference Alignment for Video Diffusion Generation Runtao Liu et.al. 2412.14167 null
2024-12-18 MetaMorph: Multimodal Understanding and Generation via Instruction Tuning Shengbang Tong et.al. 2412.14164 null
2024-12-18 MCMat: Multiview-Consistent and Physically Accurate PBR Material Generation Shenhao Zhu et.al. 2412.14148 null
2024-12-18 Event-based Photometric Bundle Adjustment Shuang Guo et.al. 2412.14111 null
2024-12-18 Future Research Avenues for Artificial Intelligence in Digital Gaming: An Exploratory Report Markus Dablander et.al. 2412.14085 null
2024-12-18 SurgSora: Decoupled RGBD-Flow Diffusion Model for Controllable Surgical Video Generation Tong Chen et.al. 2412.14018 null
2024-12-18 Comparative Analysis of Machine Learning-Based Imputation Techniques for Air Quality Datasets with High Missing Data Rates Sen Yan et.al. 2412.13966 null
2024-12-18 A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI Beiduo Chen et.al. 2412.13942 null
2024-12-18 Development of a High-Resolution, High-Dynamic-Range Charge Detector for Ion Beam Monitoring O. Adriani et.al. 2412.13934 null
2024-12-18 Investigating the Effects of Diffusion-based Conditional Generative Speech Models Used for Speech Enhancement on Dysarthric Speech Joanna Reszka et.al. 2412.13933 null
2024-12-18 Graph-Driven Models for Gas Mixture Identification and Concentration Estimation on Heterogeneous Sensor Array Signals Ding Wang et.al. 2412.13891 null
2024-12-18 Navigating limitations with precision: A fine-grained ensemble approach to wrist pathology recognition on a limited x-ray dataset Ammar Ahmed et.al. 2412.13884 null
2024-12-17 CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models Gaoyang Zhang et.al. 2412.13195 link
2024-12-17 StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models Yunzhi Yan et.al. 2412.13188 null
2024-12-17 Move-in-2D: 2D-Conditioned Human Motion Generation Hsin-Ping Huang et.al. 2412.13185 null
2024-12-17 F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration Lu Liu et.al. 2412.13155 null
2024-12-17 Prompt Augmentation for Self-supervised Text-guided Image Manipulation Rumeysa Bodur et.al. 2412.13081 null
2024-12-17 3D MedDiffusion: A 3D Medical Diffusion Model for Controllable and High-quality Medical Image Generation Haoshen Wang et.al. 2412.13059 null
2024-12-17 Guiding Generative Protein Language Models with Reinforcement Learning Filippo Stocco et.al. 2412.12979 null
2024-12-18 Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance Wenhao Sun et.al. 2412.12974 link
2024-12-17 ArchesWeather & ArchesWeatherGen: a deterministic and generative model for efficient ML weather forecasting Guillaume Couairon et.al. 2412.12971 link
2024-12-17 Modified UNIFAC 2.0 -- A Group-Contribution Method Completed with Machine Learning Nicolas Hayer et.al. 2412.12962 null
2024-12-17 MOPO: Multi-Objective Prompt Optimization for Affective Text Generation Yarik Menchaca Resendiz et.al. 2412.12948 null
2024-12-17 Generation of cosmic ray trajectories by a Diffusion Model trained on test particles in 3D magnetohydrodynamic turbulence Johannes Martin et.al. 2412.12923 null
2024-12-17 Unsupervised Region-Based Image Editing of Denoising Diffusion Models Zixiang Li et.al. 2412.12912 null
2024-12-18 ArtAug: Enhancing Text-to-Image Generation through Synthesis-Understanding Interaction Zhongjie Duan et.al. 2412.12888 link
2024-12-17 Memory-minimal quantum generation of stochastic processes: spectral invariants of quantum hidden Markov models Magdalini Zonnios et.al. 2412.12812 null
2024-12-16 Causal Diffusion Transformers for Generative Modeling Chaorui Deng et.al. 2412.12095 link
2024-12-16 CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models Felix Taubner et.al. 2412.12093 null
2024-12-16 Wonderland: Navigating 3D Scenes from a Single Image Hanwen Liang et.al. 2412.12091 null
2024-12-16 A LoRA is Worth a Thousand Pictures Chenxi Liu et.al. 2412.12048 null
2024-12-16 LLMs for Cold-Start Cutting Plane Separator Configuration Connor Lawless et.al. 2412.12038 null
2024-12-16 Learning to Navigate in Mazes with Novel Layouts using Abstract Top-down Maps Linfeng Zhao et.al. 2412.12024 null
2024-12-16 The entropic optimal (self-)transport problem: Limit distributions for decreasing regularization with application to score function estimation Gilles Mordant et.al. 2412.12007 null
2024-12-16 Controllable Shadow Generation with Single-Step Diffusion Models from Synthetic Data Onur Tasar et.al. 2412.11972 null
2024-12-16 The Erdős unit distance problem for small point sets Boris Alexeev et.al. 2412.11914 null
2024-12-16 CharacterBench: Benchmarking Character Customization of Large Language Models Jinfeng Zhou et.al. 2412.11912 link
2024-12-16 Towards Understanding Systems Trade-offs in Retrieval-Augmented Generation Model Inference Michael Shen et.al. 2412.11854 null
2024-12-16 ColorFlow: Retrieval-Augmented Image Sequence Colorization Junhao Zhuang et.al. 2412.11815 null
2024-12-16 InterDyn: Controllable Interactive Dynamics with Video Diffusion Models Rick Akkerman et.al. 2412.11785 null
2024-12-16 Joint Reconstruction of the Activity and the Attenuation in PET by Diffusion Posterior Sampling: a Feasibility Study Clémentine Phung-Ngoc et.al. 2412.11776 null
2024-12-17 No More Adam: Learning Rate Scaling at Initialization is All You Need Minghao Xu et.al. 2412.11768 link
2024-12-13 Towards a foundation model for heavy-ion collision experiments through point cloud diffusion Manjunath Omana Kuttan et.al. 2412.10352 null
2024-12-13 BrushEdit: All-In-One Image Inpainting and Editing Yaowei Li et.al. 2412.10316 null
2024-12-13 Iterating the Transient Light Transport Matrix for Non-Line-of-Sight Imaging Talha Sultan et.al. 2412.10300 null
2024-12-13 Coherent 3D Scene Diffusion From a Single RGB Image Manuel Dahnert et.al. 2412.10294 null
2024-12-13 Adversarial Robustness of Bottleneck Injected Deep Neural Networks for Task-Oriented Communication Alireza Furutanpey et.al. 2412.10265 null
2024-12-13 Targeted Angular Reversal of Weights (TARS) for Knowledge Removal in Large Language Models Harry J. Davies et.al. 2412.10257 null
2024-12-13 Exploring the Frontiers of Animation Video Generation in the Sora Era: Method, Dataset and Benchmark Yudong Jiang et.al. 2412.10255 null
2024-12-13 Radiator Tailoring for Enhanced Performance in InAs-Based Near-Field Thermophotovoltaics Mathieu Giroux et.al. 2412.10217 null
2024-12-13 GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion Jiapeng Tang et.al. 2412.10209 null
2024-12-13 Efficient Generative Modeling with Residual Vector Quantization-Based Tokens Jaehyeon Kim et.al. 2412.10208 null
2024-12-13 Simple Guidance Mechanisms for Discrete Diffusion Models Yair Schiff et.al. 2412.10193 link
2024-12-13 SwiftTry: Fast and Consistent Video Virtual Try-On with Diffusion Models Hung Nguyen et.al. 2412.10178 null
2024-12-13 Learning payoffs while routing in skill-based queues Sanne van Kempen et.al. 2412.10168 null
2024-12-13 The Art of Deception: Color Visual Illusions and Diffusion Models Alex Gomez-Villa et.al. 2412.10122 null
2024-12-13 Familiarity: Better Evaluation of Zero-Shot Named Entity Recognition by Quantifying Label Shifts in Synthetic Training Data Jonas Golde et.al. 2412.10121 null
2024-12-12 FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion Haonan Qiu et.al. 2412.09626 null
2024-12-12 Illusion3D: 3D Multiview Illusion with 2D Diffusion Priors Yue Feng et.al. 2412.09625 null
2024-12-12 GenEx: Generating an Explorable World Taiming Lu et.al. 2412.09624 null
2024-12-12 OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation Weiqi Li et.al. 2412.09623 null
2024-12-12 LoRACLR: Contrastive Adaptation for Customization of Diffusion Models Enis Simsar et.al. 2412.09622 null
2024-12-12 SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training Dongting Hu et.al. 2412.09619 null
2024-12-12 EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM Zhuofan Zong et.al. 2412.09618 null
2024-12-12 Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG Kavana Venkatesh et.al. 2412.09614 null
2024-12-13 Olympus: A Universal Task Router for Computer Vision Tasks Yuanze Lin et.al. 2412.09612 link
2024-12-12 Owl-1: Omni World Model for Consistent Long Video Generation Yuanhui Huang et.al. 2412.09600 link
2024-12-12 LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors Yabo Chen et.al. 2412.09597 null
2024-12-12 Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion Zexin He et.al. 2412.09593 null
2024-12-12 Improving the Reliability of Cable Broadband Networks via Proactive Network Maintenance Jiyao Hu et.al. 2412.09564 null
2024-12-12 Meshtron: High-Fidelity, Artist-Like 3D Mesh Generation at Scale Zekun Hao et.al. 2412.09548 null
2024-12-12 SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing Xueting Li et.al. 2412.09545 null
2024-12-11 Generative Semantic Communication: Architectures, Technologies, and Applications Jinke Ren et.al. 2412.08642 null
2024-12-11 DMin: Scalable Training Data Influence Estimation for Diffusion Models Huawei Lin et.al. 2412.08637 link
2024-12-11 Multimodal Latent Language Modeling with Next-Token Diffusion Yutao Sun et.al. 2412.08635 link
2024-12-11 An SDR-Based Monostatic Wi-Fi System with Analog Self-Interference Cancellation for Sensing Andreas Toftegaard Kristensen et.al. 2412.08612 null
2024-12-12 Design2GarmentCode: Turning Design Concepts to Tangible Garments Through Program Synthesis Feng Zhou et.al. 2412.08603 null
2024-12-11 TryOffAnyone: Tiled Cloth Generation from a Dressed Person Ioannis Xarchakos et.al. 2412.08573 link
2024-12-12 Watermarking Training Data of Music Generation Models Pascal Epple et.al. 2412.08549 null
2024-12-11 Orderly Management of Packets in RDMA by Eunomia Sana Mahmood et.al. 2412.08540 null
2024-12-11 Ensemble-Based Quantum-Token Protocol Benchmarked on IBM Quantum Processors Lucas Tsunaki et.al. 2412.08530 null
2024-12-11 Comparative Opinion Mining in Product Reviews: Multi-perspective Prompt-based Learning Hai-Yen Thi Nguyen et.al. 2412.08508 null
2024-12-11 Open-Loop and Model Predictive Control for Electric Vehicle Charging to Manage Excess Renewable Energy Supply in Texas Kelsey M. Nelson et.al. 2412.08505 null
2024-12-11 Learning Flow Fields in Attention for Controllable Person Image Generation Zijian Zhou et.al. 2412.08486 link
2024-12-11 InvDiff: Invariant Guidance for Bias Mitigation in Diffusion Models Min Hou et.al. 2412.08480 link
2024-12-11 CC-Diff: Enhancing Contextual Coherence in Remote Sensing Image Synthesis Mu Zhang et.al. 2412.08464 null
2024-12-11 Federated Learning for Traffic Flow Prediction with Synthetic Data Augmentation Fermin Orozco et.al. 2412.08460 null
2024-12-10 Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets Zhen Liu et.al. 2412.07775 null
2024-12-10 UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics Xi Chen et.al. 2412.07774 null
2024-12-10 From Slow Bidirectional to Fast Causal Video Generators Tianwei Yin et.al. 2412.07772 null
2024-12-10 Make-A-Texture: Fast Shape-Aware Texture Generation in 3 Seconds Xiaoyu Xiang et.al. 2412.07766 null
2024-12-10 Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences Alan Nawzad Amin et.al. 2412.07763 link
2024-12-10 Repurposing Pre-trained Video Diffusion Models for Event-based Video Interpolation Jingxi Chen et.al. 2412.07761 null
2024-12-10 SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints Jianhong Bai et.al. 2412.07760 link
2024-12-10 PortraitTalk: Towards Customizable One-Shot Audio-to-Talking Face Generation Fatemeh Nazarieh et.al. 2412.07754 null
2024-12-10 Multi-Shot Character Consistency for Text-to-Video Generation Yuval Atzmon et.al. 2412.07750 null
2024-12-10 StyleMaster: Stylize Your Video with Artistic Generation and Translation Zixuan Ye et.al. 2412.07744 null
2024-12-10 STIV: Scalable Text and Image Conditioned Video Generation Zongyu Lin et.al. 2412.07730 null
2024-12-10 ObjCtrl-2.5D: Training-free Object Control with Camera Poses Zhouxia Wang et.al. 2412.07721 null
2024-12-10 ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer Jinyi Hu et.al. 2412.07720 link
2024-12-10 Privacy-Preserving Customer Support: A Framework for Secure and Scalable Interactions Anant Prakash Awasthi et.al. 2412.07687 null
2024-12-10 Optimizing Sensor Redundancy in Sequential Decision-Making Problems Jonas Nüßlein et.al. 2412.07686 null
2024-12-10 [MASK] is All You Need Vincent Tao Hu et.al. 2412.06787 link
2024-12-09 Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation Ruihan Gao et.al. 2412.06785 link
2024-12-09 Diverse Score Distillation Yanbo Xu et.al. 2412.06780 null
2024-12-09 Visual Lexicon: Rich Image Features in Language Space XuDong Wang et.al. 2412.06774 null
2024-12-09 InstantRestore: Single-Step Personalized Face Restoration with Shared-Image Attention Howard Zhang et.al. 2412.06753 null
2024-12-09 ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities Adhiraj Ghosh et.al. 2412.06745 null
2024-12-10 ContRail: A Framework for Realistic Railway Image Synthesis using ControlNet Andrei-Robert Alexandrescu et.al. 2412.06742 null
2024-12-09 Take Fake as Real: Realistic-like Robust Black-box Adversarial Attack to Evade AIGC Detection Caiyun Xie et.al. 2412.06727 link
2024-12-09 You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale Baorui Ma et.al. 2412.06699 link
2024-12-09 Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy Yuxuan Xue et.al. 2412.06698 null
2024-12-09 Diff5T: Benchmarking Human Brain Diffusion MRI with an Extensive 5.0 Tesla K-Space and Spatial Dataset Shanshan Wang et.al. 2412.06666 null
2024-12-09 Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion Shuaiting Li et.al. 2412.06661 null
2024-12-09 MVReward: Better Aligning and Evaluating Multi-View Diffusion Models with Human Preferences Weitao Wang et.al. 2412.06614 null
2024-12-09 Augmented reality for upper limb rehabilitation: real-time kinematic feedback with HoloLens 2 Beatrice Luciani et.al. 2412.06596 null
2024-12-09 EmoSpeech: A Corpus of Emotionally Rich and Contextually Detailed Speech Annotations Weizhen Bian et.al. 2412.06581 null
2024-12-06 Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model Lening Wang et.al. 2412.05280 link
2024-12-06 Perturb-and-Revise: Flexible 3D Editing with Generative Trajectories Susung Hong et.al. 2412.05279 null
2024-12-06 Birth and Death of a Rose Chen Geng et.al. 2412.05278 null
2024-12-06 MotionFlow: Attention-Driven Motion Transfer in Video Diffusion Models Tuna Han Salih Meral et.al. 2412.05275 null
2024-12-06 Go-or-Grow Models in Biology: a Monster on a Leash R. Thiessen et.al. 2412.05191 null
2024-12-06 Privacy Drift: Evolving Privacy Concerns in Incremental Learning Sayyed Farid Ahamed et.al. 2412.05183 null
2024-12-06 DNF: Unconditional 4D Generation with Dictionary-based Neural Fields Xinyi Zhang et.al. 2412.05161 null
2024-12-06 A text-to-tabular approach to generate synthetic patient data using LLMs Margaux Tornqvist et.al. 2412.05153 link
2024-12-06 LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation Donald Shenaj et.al. 2412.05148 null
2024-12-06 How to Squeeze An Explanation Out of Your Model Tiago Roxo et.al. 2412.05134 null
2024-12-06 Probabilistic Galaxy Field Generation with Diffusion Models Tanner Sether et.al. 2412.05131 null
2024-12-06 The Silent Prompt: Initial Noise as Implicit Guidance for Goal-Driven Image Generation Ruoyu Wang et.al. 2412.05101 null
2024-12-06 Reconstructing Quantitative Cerebral Perfusion Images Directly From Measured Sinogram Data Acquired Using C-arm Cone-Beam CT Haotian Zhao et.al. 2412.05084 null
2024-12-06 ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration Chi-Wei Hsiao et.al. 2412.05043 null
2024-12-06 Get It Right: Improving Comprehensibility with Adaptable Speech Expression of a Humanoid Service Robot Thomas Sievers et.al. 2412.05022 null
2024-12-05 PaintScene4D: Consistent 4D Scene Generation from Text Prompts Vinayak Gupta et.al. 2412.04471 null
2024-12-05 LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors Yusuf Dalva et.al. 2412.04460 null
2024-12-05 Four-Plane Factorized Video Autoencoders Mohammed Suhail et.al. 2412.04452 null
2024-12-05 MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation Longtao Zheng et.al. 2412.04448 null
2024-12-05 DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models Yizhuo Li et.al. 2412.04446 null
2024-12-05 Learning Artistic Signatures: Symmetry Discovery and Style Transfer Emma Finn et.al. 2412.04441 null
2024-12-05 GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration Kaiyi Huang et.al. 2412.04440 null
2024-12-05 Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation Yuying Ge et.al. 2412.04432 link
2024-12-05 Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis Jian Han et.al. 2412.04431 link
2024-12-05 Reversible molecular simulation for training classical and machine learning force fields Joe G Greener et.al. 2412.04374 link
2024-12-05 Machine Theory of Mind for Autonomous Cyber-Defence Luke Swaby et.al. 2412.04367 null
2024-12-05 ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation Dayoung Gong et.al. 2412.04353 null
2024-12-05 RMD: A Simple Baseline for More General Human Motion Generation via Training-free Retrieval-Augmented Motion Diffuse Zhouyingcheng Liao et.al. 2412.04343 null
2024-12-05 Likelihood-Scheduled Score-Based Generative Modeling for Fully 3D PET Image Reconstruction George Webber et.al. 2412.04339 null
2024-12-05 Multi-Subject Image Synthesis as a Generative Prior for Single-Subject PET Image Reconstruction George Webber et.al. 2412.04324 null
2024-12-04 Navigation World Models Amir Bar et.al. 2412.03572 null
2024-12-04 MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation Zehuan Huang et.al. 2412.03558 null
2024-12-04 NODE-AdvGAN: Improving the transferability and perceptual similarity of adversarial examples by dynamic-system-driven adversarial generative model Xinheng Xie et.al. 2412.03539 null
2024-12-04 NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images Lingen Li et.al. 2412.03517 null
2024-12-04 Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion Shengyuan Zhang et.al. 2412.03515 link
2024-12-04 Data Fusion of Semantic and Depth Information in the Context of Object Detection Md Abu Yusuf et.al. 2412.03490 null
2024-12-04 Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective Neta Shaul et.al. 2412.03487 null
2024-12-04 Pre-trained Multiple Latent Variable Generative Models are good defenders against Adversarial Attacks Dario Serez et.al. 2412.03453 link
2024-12-04 CleanDIFT: Diffusion Features without Noise Nick Stracke et.al. 2412.03439 link
2024-12-04 SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model Yan Li et.al. 2412.03430 null
2024-12-04 Skel3D: Skeleton Guided Novel View Synthesis Aron Fóthi et.al. 2412.03407 null
2024-12-04 Identifiability implies consistency of MLE in partially observed diffusions on a torus Ibrahim Ekren et.al. 2412.03380 null
2024-12-04 TASR: Timestep-Aware Diffusion Model for Image Super-Resolution Qinwei Lin et.al. 2412.03355 link
2024-12-04 DIVE: Taming DINO for Subject-Driven Video Editing Yi Huang et.al. 2412.03347 null
2024-12-04 Geometry-guided Cross-view Diffusion for One-to-many Cross-view Image Synthesis Tao Jun Lin et.al. 2412.03315 null
2024-12-03 Motion Prompting: Controlling Video Generation with Motion Trajectories Daniel Geng et.al. 2412.02700 null
2024-12-03 Diffusion-based Visual Anagram as Multi-task Learning Zhiyuan Xu et.al. 2412.02693 link
2024-12-03 FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation Kefan Chen et.al. 2412.02690 null
2024-12-04 SNOOPI: Supercharged One-step Diffusion Distillation with Proper Guidance Viet Nguyen et.al. 2412.02687 null
2024-12-03 AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction Lingteng Qiu et.al. 2412.02684 null
2024-12-03 Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation Yiftach Edelstein et.al. 2412.02631 null
2024-12-03 The effect of priors on Learning with Restricted Boltzmann Machines Gianluca Manzan et.al. 2412.02623 null
2024-12-03 ComPair-2: A Next Generation Medium Energy Gamma-ray Telescope Prototype Regina Caputo et.al. 2412.02562 null
2024-12-03 The Two-Center Problem of Uncertain Points on Cactus Graphs Haitao Xu et.al. 2412.02559 null
2024-12-03 ShadowHack: Hacking Shadows via Luminance-Color Divide and Conquer Jin Hu et.al. 2412.02545 link
2024-12-03 Unveiling Concept Attribution in Diffusion Models Quang H. Nguyen et.al. 2412.02542 null
2024-12-03 LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data Hanyu Zhang et.al. 2412.02525 null
2024-12-03 GerPS-Compare: Comparing NER methods for legal norm analysis Sarah T. Bachinger et.al. 2412.02427 null
2024-12-03 It Takes Two: Real-time Co-Speech Two-person's Interaction Generation via Reactive Auto-regressive Diffusion Model Mingyi Shi et.al. 2412.02419 null
2024-12-03 A Multi-Agent Framework for Extensible Structured Text Generation in PLCs Donghao Yang et.al. 2412.02410 null
2024-11-29 Nanostructured micrometric-pore membranes for nanofiltration: Micrometric geometry may optimize performance, energy efficiency and operational lifetime J. C. Verde et.al. 2411.19900 null
2024-11-29 Input-Output Optics as a Causal Time Series Mapping: A Generative Machine Learning Solution Abhijit Sen et.al. 2411.19897 null
2024-11-29 MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks Yiming Wu et.al. 2411.19786 null
2024-11-29 Riemannian Denoising Score Matching for Molecular Structure Optimization with Accurate Energy Jeheon Woo et.al. 2411.19769 null
2024-11-29 JetFormer: An Autoregressive Generative Model of Raw Images and Text Michael Tschannen et.al. 2411.19722 null
2024-11-29 Inverse Design of Mechanical Metamaterials Using a Point-Cloud-Based Deep Generative Model Seungwook Hong et.al. 2411.19681 null
2024-11-29 TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting Bojun Xiong et.al. 2411.19654 null
2024-11-29 Uniform Attention Maps: Boosting Image Fidelity in Reconstruction and Editing Wenyi Mo et.al. 2411.19652 link
2024-11-29 Enhancing Security in Third-Party Library Reuse -- Comprehensive Detection of 1-day Vulnerability through Code Patch Analysis Shangzhi Xu et.al. 2411.19648 null
2024-11-29 Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings Qiong Wu et.al. 2411.19628 link
2024-11-29 Unimib Assistant: designing a student-friendly RAG-based chatbot for all their needs Chiara Antico et.al. 2411.19554 null
2024-11-29 Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook Florinel-Alin Croitoru et.al. 2411.19537 link
2024-11-29 Quantized Delta Weight Is Safety Keeper Yule Liu et.al. 2411.19530 null
2024-12-02 DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding Jungbin Cho et.al. 2411.19527 null
2024-11-29 Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis Tianqi Li et.al. 2411.19509 null
2024-11-27 Textured Gaussians for Enhanced 3D Scene Appearance Modeling Brian Chao et.al. 2411.18625 null
2024-11-27 GeneMAN: Generalizable Single-Image 3D Human Reconstruction from Multi-Source Human Data Wentao Wang et.al. 2411.18624 null
2024-11-27 Diffusion Self-Distillation for Zero-Shot Customized Image Generation Shengqu Cai et.al. 2411.18616 null
2024-11-27 CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models Rundi Wu et.al. 2411.18613 null
2024-11-27 Evaluating and Improving the Effectiveness of Synthetic Chest X-Rays for Medical Image Analysis Eva Prakash et.al. 2411.18602 null
2024-11-27 Bit symmetry entails the symmetry of the quantum transition probability Gerd Niestegge et.al. 2411.18589 null
2024-11-27 Building Confidence in Deep Generative Protein Design Tianyuan Zheng et.al. 2411.18568 link
2024-11-27 High-throughput antibody screening with high-quality factor nanophotonics and bioprinting Sajjad Abdollahramezani et.al. 2411.18557 null
2024-11-27 FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion Haosen Yang et.al. 2411.18552 null
2024-11-28 Enhancing weed detection performance by means of GenAI-based image augmentation Sourav Modak et.al. 2411.18513 null
2024-11-27 GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation Pengfei Zhou et.al. 2411.18499 null
2024-11-27 Synthetic ECG Generation for Data Augmentation and Transfer Learning in Arrhythmia Classification José Fernando Núñez et.al. 2411.18456 null
2024-11-27 Is my Meeting Summary Good? Estimating Quality with a Multi-LLM Evaluator Frederic Kirstein et.al. 2411.18444 null
2024-11-27 Learning the Evolution of Physical Structure of Galaxies via Diffusion Models Andrew Lizarraga et.al. 2411.18440 link
2024-11-27 Search for heavy scalar or pseudoscalar states in $\mathrm{t \bar{t}}$ events at CMS Laurids Jeppe et.al. 2411.18414 null
2024-11-27 StableAnimator: High-Quality Identity-Preserving Human Image Animation Shuyuan Tu et.al. 2411.17697 link
2024-11-26 ScribbleLight: Single Image Indoor Relighting with Scribbles Jun Myeong Choi et.al. 2411.17696 null
2024-11-26 Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis Akshita Gupta et.al. 2411.17690 null
2024-11-26 GenDeg: Diffusion-Based Degradation Synthesis for Generalizable All-in-One Image Restoration Sudarshan Rajagopalan et.al. 2411.17687 null
2024-11-26 Semi-analytical model for the calculation of solar radiation pressure and its effects on a LEO satellite with predicting the change in position vectors using machine learning techniques Pranava Seth et.al. 2411.17626 null
2024-11-26 Accelerating Vision Diffusion Transformers with Skip Branches Guanjie Chen et.al. 2411.17616 link
2024-11-26 Mixed-State Quantum Denoising Diffusion Probabilistic Model Gino Kwun et.al. 2411.17608 null
2024-11-26 Making History Readable Bipasha Banerjee et.al. 2411.17600 null
2024-11-26 VideoDirector: Precise Video Editing via Text-to-Video Models Yukun Wang et.al. 2411.17592 null
2024-11-26 Rapid Deployment of Domain-specific Hyperspectral Image Processors with Application to Autonomous Driving Jon Gutiérrez-Zaballa et.al. 2411.17543 null
2024-11-26 Metaverse Innovation Canvas: A Tool for Extended Reality Product/Service Development Amir Reza Asadi et.al. 2411.17541 null
2024-11-26 IMPROVE: Improving Medical Plausibility without Reliance on Human Validation -- An Enhanced Prototype-Guided Diffusion Framework Anurag Shandilya et.al. 2411.17535 null
2024-11-26 FTMoMamba: Motion Generation with Frequency and Text State Space Models Chengjian Li et.al. 2411.17532 null
2024-11-26 Exact and Heuristic Approaches for the Covering Tour Location Routing Problem Andreas Hagn et.al. 2411.17510 link
2024-11-26 WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model Zongjian Li et.al. 2411.17459 link
2024-11-25 Generative Omnimatte: Learning to Decompose Video into Layers Yao-Chih Lee et.al. 2411.16683 null
2024-11-25 Diffusion Features for Zero-Shot 6DoF Object Pose Estimation Bernd Von Gimborn et.al. 2411.16668 null
2024-11-25 DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation Zun Wang et.al. 2411.16657 null
2024-11-25 Exploring Discrete Flow Matching for 3D De Novo Molecule Generation Ian Dunn et.al. 2411.16644 link
2024-11-25 LegoPET: Hierarchical Feature Guided Conditional Diffusion for PET Image Reconstruction Yiran Sun et.al. 2411.16629 null
2024-11-25 Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models Ronghuan Wu et.al. 2411.16602 null
2024-11-25 Unlocking The Potential of Adaptive Attacks on Diffusion-Based Purification Andre Kassis et.al. 2411.16598 link
2024-11-25 Rethinking Diffusion for Text-Driven Human Motion Generation Zichong Meng et.al. 2411.16575 null
2024-11-25 Representation Collapsing Problems in Vector Quantization Wenhao Zhao et.al. 2411.16550 null
2024-11-25 ADOBI: Adaptive Diffusion Bridge For Blind Inverse Problems with Application to MRI Reconstruction Yuyang Hu et.al. 2411.16535 null
2024-11-25 PriorPath: Coarse-To-Fine Approach for Controlled De-Novo Pathology Semantic Masks Generation Nati Daniel et.al. 2411.16515 null
2024-11-25 Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis Boming Miao et.al. 2411.16503 null
2024-11-25 Multi-Resolution Generative Modeling of Human Motion from Limited Data David Eduardo Moreno-Villamarín et.al. 2411.16498 null
2024-11-25 Learning by Analogy: Enhancing Few-Shot Prompting for Math Word Problem Solving with Computational Graph-Based Retrieval Xiaocong Yang et.al. 2411.16454 null
2024-11-25 Model-based reinforcement corrosion prediction: Continuous calibration with Bayesian optimization and corrosion wire sensor data A. Potnis et.al. 2411.16447 null
2024-11-22 DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving Bencheng Liao et.al. 2411.15139 link
2024-11-22 Material Anything: Generating Materials for Any 3D Object via Diffusion Xin Huang et.al. 2411.15138 null
2024-11-22 VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement Daeun Lee et.al. 2411.15115 null
2024-11-22 RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts Hjalmar Wijk et.al. 2411.15114 link
2024-11-22 Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion Samarth N Ramesh et.al. 2411.15113 null
2024-11-22 Leapfrog Latent Consistency Model (LLCM) for Medical Images Generation Lakshmikar R. Polamreddy et.al. 2411.15084 link
2024-11-22 Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network Irfan Nafiz Shahan et.al. 2411.15082 link
2024-11-22 Empowering Clients: Transformation of Design Processes Due to Generative AI Johannes Schneider et.al. 2411.15061 null
2024-11-22 The 1D nonlocal Fisher-KPP equation with a top hat kernel. Part 3. The effect of perturbations in the kernel David John Needham et.al. 2411.15054 null
2024-11-22 FloAt: Flow Warping of Self-Attention for Clothing Animation Generation Swasti Shreya Mishra et.al. 2411.15028 null
2024-11-22 Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation Huy Le et.al. 2411.14913 null
2024-11-22 Dynamically Encircled Higher-order Exceptional Points in an Optical Fiber Arpan Roy et.al. 2411.14874 null
2024-11-22 Prioritize Denoising Steps on Diffusion Model Preference Alignment via Explicit Denoised Distribution Estimation Dingyuan Shi et.al. 2411.14871 null
2024-11-22 Latent Schrodinger Bridge: Prompting Latent Diffusion for Fast Unpaired Image-to-Image Translation Jeongsol Kim et.al. 2411.14863 null
2024-11-22 Style-Friendly SNR Sampler for Style-Driven Generation Jooyoung Choi et.al. 2411.14793 null
2024-11-21 Stable Flow: Vital Layers for Training-Free Image Editing Omri Avrahami et.al. 2411.14430 null
2024-11-21 Transformer-based Heuristic for Advanced Air Mobility Planning Jun Xiang et.al. 2411.14427 null
2024-11-21 A Python-Based Approach to Sputter Deposition Simulations in Combinatorial Materials Science Felix Thelen et.al. 2411.14413 null
2024-11-21 Multi-Agent Environments for Vehicle Routing Problems Ricardo Gama et.al. 2411.14411 link
2024-11-21 Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation Yuanhao Cai et.al. 2411.14384 null
2024-11-21 CoNFiLD-inlet: Synthetic Turbulence Inflow Using Generative Latent Diffusion Models with Neural Fields Xin-Yang Liu et.al. 2411.14378 null
2024-11-21 Enhancing Medical Image Segmentation with Deep Learning and Diffusion Models Houze Liu et.al. 2411.14353 null
2024-11-21 DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding Tianhe Ren et.al. 2411.14347 link
2024-11-21 Lower Dimensional Spherical Representation of Medium Voltage Load Profiles for Visualization, Outlier Detection, and Generative Modelling Edgar Mauricio Salazar Duque et.al. 2411.14346 null
2024-11-21 StereoCrafter-Zero: Zero-Shot Stereo Video Generation with Noisy Restart Jian Shi et.al. 2411.14295 null
2024-11-21 Efficient Aspect-Based Summarization of Climate Change Reports with Small Language Models Iacopo Ghinassi et.al. 2411.14272 link
2024-11-21 Guided MRI Reconstruction via Schrödinger Bridge Yue Wang et.al. 2411.14269 null
2024-11-21 Regional Attention for Shadow Removal Hengxing Liu et.al. 2411.14201 link
2024-11-21 TaQ-DiT: Time-aware Quantization for Diffusion Transformers Xinyan Liu et.al. 2411.14172 null
2024-11-21 Creating a Formally Verified Neural Network for Autonomous Navigation: An Experience Report Syed Ali Asadullah Bukhari et.al. 2411.14163 link
2024-11-20 REDUCIO! Generating 1024 $\times$ 1024 Video within 16 Seconds using Extremely Compressed Motion Latents Rui Tian et.al. 2411.13552 link
2024-11-20 Identity Preserving 3D Head Stylization with Multiview Score Distillation Bahri Batuhan Bilecen et.al. 2411.13536 null
2024-11-20 VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models Ziqi Huang et.al. 2411.13503 link
2024-11-20 LIMBA: An Open-Source Framework for the Preservation and Valorization of Low-Resource Languages using Generative Models Salvatore Mario Carta et.al. 2411.13453 null
2024-11-20 Heuristically Adaptive Diffusion-Model Evolutionary Strategy Benedikt Hartl et.al. 2411.13420 null
2024-11-20 Energy-based generative models for monoclonal antibodies Paul Pereira et.al. 2411.13390 link
2024-11-20 Small and Close-In Planets are Uncommon around A-type Stars Steven Giacalone et.al. 2411.13363 null
2024-11-20 Vertical Validation: Evaluating Implicit Generative Models for Graphs on Thin Support Regions Mai Elkady et.al. 2411.13358 null
2024-11-20 A CSI Feedback Framework based on Transmitting the Important Values and Generating the Others Zhilin Du et.al. 2411.13298 null
2024-11-21 Structure-Based Molecule Optimization via Gradient-Guided Bayesian Update Keyue Qiu et.al. 2411.13280 null
2024-11-20 XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation Ziyi Wang et.al. 2411.13243 link
2024-11-20 BIPro: Zero-shot Chinese Poem Generation via Block Inverse Prompting Constrained Generation Framework Xu Zou et.al. 2411.13237 null
2024-11-20 Building music with Lego bricks and Raspberry Pi Ana M. Barbancho et.al. 2411.13224 null
2024-11-20 A computational framework for integrating Predictive processes with evidence Accumulation Models (PAM) Antonino Visalli et.al. 2411.13203 link
2024-11-20 OpenMS WebApps: Building User-Friendly Solutions for MS Analysis Tom David Müller et.al. 2411.13189 null
2024-11-19 Enhancing Multi-Class Disease Classification: Neoplasms, Cardiovascular, Nervous System, and Digestive Disorders Using Advanced LLMs Ahmed Akib Jawad Karim et.al. 2411.12712 null
2024-11-19 OrigamiPlot: An R Package and Shiny Web App Enhanced Visualizations for Multivariate Data Yiwen Lu et.al. 2411.12674 null
2024-11-19 Auto-Evaluation with Few Labels through Post-hoc Regression Benjamin Eyre et.al. 2411.12665 null
2024-11-19 PoM: Efficient Image and Video Generation with the Polynomial Mixer David Picard et.al. 2411.12663 link
2024-11-19 Optimizing Airline Reservation Systems with Edge-Enabled Microservices: A Framework for Real-Time Data Processing and Enhanced User Responsiveness Biman Barua et.al. 2411.12650 null
2024-11-19 DLBacktrace: A Model Agnostic Explainability for any Deep Learning Models Vinay Kumar Sankarapu et.al. 2411.12643 link
2024-11-19 Improving Controllability and Editability for Pretrained Text-to-Music Generation Models Yixiao Zhang et.al. 2411.12641 null
2024-11-19 Universal programmable waveguide arrays Akram Youssry et.al. 2411.12610 null
2024-11-19 Whisper Finetuning on Nepali Language Sanjay Rijal et.al. 2411.12587 null
2024-11-19 Predicting Customer Satisfaction by Replicating the Survey Response Distribution Etienne Manderscheid et.al. 2411.12539 null
2024-11-19 Data Pruning in Generative Diffusion Models Rania Briq et.al. 2411.12523 null
2024-11-19 Probe-Me-Not: Protecting Pre-trained Encoders from Malicious Probing Ruyi Ding et.al. 2411.12508 null
2024-11-19 Empirical Privacy Evaluations of Generative and Predictive Machine Learning Models -- A review and challenges for practice Flavio Hafner et.al. 2411.12451 null
2024-11-19 Frequency-Aware Guidance for Blind Image Restoration via Diffusion Models Jun Xiao et.al. 2411.12450 null
2024-11-19 A general modeling and simulation framework for dynamic vehicle routing Markó Horváth et.al. 2411.12406 link
2024-11-18 QARM: Quantitative Alignment Multi-Modal Recommendation at Kuaishou Xinchen Luo et.al. 2411.11739 null
2024-11-18 Aligning Few-Step Diffusion Models with Dense Reward Difference Learning Ziyi Zhang et.al. 2411.11727 link
2024-11-18 Multiscale nonlinear integration drives accurate encoding of input information Giorgio Nicoletti et.al. 2411.11710 null
2024-11-18 Robust Reinforcement Learning under Diffusion Models for Data with Jumps Chenyang Jiang et.al. 2411.11697 null
2024-11-18 Active droplets controlled by enzymatic reactions Jacques Fries et.al. 2411.11696 null
2024-11-18 Do Captioning Metrics Reflect Music Semantic Alignment? Jinwoo Lee et.al. 2411.11692 null
2024-11-18 Conceptwm: A Diffusion Model Watermark for Concept Protection Liangqi Lei et.al. 2411.11688 null
2024-11-19 GNN-Based Code Annotation Logic for Establishing Security Boundaries in C Code Varun Gadey et.al. 2411.11567 null
2024-11-19 Cascaded Diffusion Models for 2D and 3D Microscopy Image Synthesis to Enhance Cell Segmentation Rüveyda Yilmaz et.al. 2411.11515 null
2024-11-18 Collaborative Contrastive Network for Click-Through Rate Prediction Chen Gao et.al. 2411.11508 null
2024-11-18 LaVin-DiT: Large Vision Diffusion Transformer Zhaoqing Wang et.al. 2411.11505 null
2024-11-18 Alien Recombination: Exploring Concept Blends Beyond Human Cognitive Availability in Visual Art Alejandro Hernandez et.al. 2411.11494 null
2024-11-18 MVLight: Relightable Text-to-3D Generation via Light-conditioned Multi-View Diffusion Dongseok Shim et.al. 2411.11475 null
2024-11-18 GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts Junwen He et.al. 2411.11435 null
2024-11-18 CLUE-MARK: Watermarking Diffusion Models using CLWE Kareem Shehata et.al. 2411.11434 null
2024-11-15 M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation Sucheng Ren et.al. 2411.10433 link
2024-11-15 Mitigating Parameter Degeneracy using Joint Conditional Diffusion Model for WECC Composite Load Model in Power Systems Feiqin Zhu et.al. 2411.10431 null
2024-11-15 Multiscale Dubuc: A New Similarity Measure for Time Series Mahsa Khazaei et.al. 2411.10418 link
2024-11-15 Experimental generation of extreme electron beams for advanced accelerator applications Claudio Emma et.al. 2411.10413 null
2024-11-15 How to Build a Quantum Supercomputer: Scaling Challenges and Opportunities Masoud Mohseni et.al. 2411.10406 null
2024-11-15 Nonlinearity-Driven Morphing and Control of Topological Modes in Non-Hermitian Systems Zhao-Fan Cai et.al. 2411.10398 null
2024-11-15 Towards High-Fidelity 3D Portrait Generation with Rich Details by Cross-View Prior-Aware Diffusion Haoran Wei et.al. 2411.10369 null
2024-11-15 Safe Text-to-Image Generation: Simply Sanitize the Prompt Embedding Huming Qiu et.al. 2411.10329 null
2024-11-15 Probabilistic Prior Driven Attention Mechanism Based on Diffusion Model for Imaging Through Atmospheric Turbulence Guodong Sun et.al. 2411.10321 null
2024-11-15 Assortment Optimization under the Multinomial Logit Model with Covering Constraints Omar El Housni et.al. 2411.10310 null
2024-11-15 Modification Takes Courage: Seamless Image Stitching via Reference-Driven Inpainting Ziqi Xie et.al. 2411.10309 link
2024-11-15 MDHP-Net: Detecting Injection Attacks on In-vehicle Network using Multi-Dimensional Hawkes Process and Temporal Model Qi Liu et.al. 2411.10258 null
2024-11-15 The Unreasonable Effectiveness of Guidance for Diffusion Models Tim Kaiser et.al. 2411.10257 null
2024-11-15 Smooth transport map via diffusion process Arthur Stéphanovitch et.al. 2411.10235 null
2024-11-15 ColorEdit: Training-free Image-Guided Color editing with diffusion model Xingxi Yin et.al. 2411.10232 null
2024-11-14 A Bayesian Optimization Approach to Machine Translation Reranking Julius Cheng et.al. 2411.09694 null
2024-11-14 SimTube: Generating Simulated Video Comments through Multimodal AI and User Personas Yu-Kai Hung et.al. 2411.09577 null
2024-11-14 Golden Noise for Diffusion Models: A Learning Framework Zikai Zhou et.al. 2411.09502 null
2024-11-14 Sparse Bayesian Generative Modeling for Compressive Sensing Benedikt Böck et.al. 2411.09483 link
2024-11-14 DiffRoad: Realistic and Diverse Road Scenario Generation for Autonomous Vehicle Testing Junjie Zhou et.al. 2411.09451 null
2024-11-14 Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models Chutian Meng et.al. 2411.09449 null
2024-11-14 A survey of probabilistic generative frameworks for molecular simulations Richard John et.al. 2411.09388 link
2024-11-14 Multi-scale Generative Modeling for Fast Sampling Xiongye Xiao et.al. 2411.09356 null
2024-11-14 ParaLBench: A Large-Scale Benchmark for Computational Paralinguistics over Acoustic Foundation Models Zixing Zhang et.al. 2411.09349 null
2024-11-15 Approximate Probabilistic Inference for Time-Series Data: A Robust Latent Gaussian Model With Temporal Awareness Anton Johansson et.al. 2411.09312 null
2024-11-14 EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models Soowon Kim et.al. 2411.09302 null
2024-11-14 LES-Talker: Fine-Grained Emotion Editing for Talking Head Generation in Linear Emotion Space Guanwen Feng et.al. 2411.09268 null
2024-11-14 Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey Xuannan Liu et.al. 2411.09259 link
2024-11-14 RibCageImp: A Deep Learning Framework for 3D Ribcage Implant Generation Gyanendra Chaubey et.al. 2411.09204 null
2024-11-14 Improvement and Implementation of a Speech Emotion Recognition Model Based on Dual-Layer LSTM Xiaoran Yang et.al. 2411.09189 null
2024-11-13 4D Gaussian Splatting in the Wild with Uncertainty-Aware Regularization Mijeong Kim et.al. 2411.08879 null
2024-11-13 A generalized software framework for consolidation of radiotherapy planning and delivery data from diverse data sources Yasin Abdulkadir et.al. 2411.08876 null
2024-11-13 Offline Adaptation of Quadruped Locomotion using Diffusion Models Reece O'Mahoney et.al. 2411.08832 null
2024-11-13 SANDWICH: Towards an Offline, Differentiable, Fully-Trainable Wireless Neural Ray-Tracing Surrogate Yifei Jin et.al. 2411.08767 null
2024-11-13 Analyst Reports and Stock Performance: Evidence from the Chinese Market Rui Liu et.al. 2411.08726 null
2024-11-14 Reducing ADC Front-end Costs During Training of On-sensor Printed Multilayer Perceptrons Florentia Afentaki et.al. 2411.08674 null
2024-11-13 Joint Model Caching and Resource Allocation in Generative AI-Enabled Wireless Edge Networks Zhang Liu et.al. 2411.08672 null
2024-11-13 Toward Human Understanding with Controllable Synthesis Hanz Cuevas-Velasquez et.al. 2411.08663 null
2024-11-13 The Galactica database: an open, generic and versatile tool for the dissemination of simulation data in astrophysics Damien Chapon et.al. 2411.08647 null
2024-11-13 Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models Chengdong Dong et.al. 2411.08642 null
2024-11-13 Deep Generative Demand Learning for Newsvendor and Pricing Shijin Gong et.al. 2411.08631 null
2024-11-13 LG-Gaze: Learning Geometry-aware Continuous Prompts for Language-Guided Gaze Estimation Pengwei Yin et.al. 2411.08606 null
2024-11-13 CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs Suhas S Kowshik et.al. 2411.08553 null
2024-11-13 Explainers' Mental Representations of Explainees' Needs in Everyday Explanations Michael Erol Schaffer et.al. 2411.08514 null
2024-11-13 HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere Hatef Otroshi Shahreza et.al. 2411.08470 null
2024-11-12 Scaling Properties of Diffusion Models for Perceptual Tasks Rahul Ravishankar et.al. 2411.08034 null
2024-11-12 GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation Yushi Lan et.al. 2411.08033 null
2024-11-12 Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings Aditya Sanghi et.al. 2411.08017 link
2024-11-12 JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation Yiyang Ma et.al. 2411.07975 link
2024-11-12 Diverse capability and scaling of diffusion and auto-regressive models when learning abstract rules Binxu Wang et.al. 2411.07873 null
2024-11-12 Trustful LLMs: Customizing and Grounding Text Generation with Knowledge Bases and Dual Decoders Xiaofeng Zhu et.al. 2411.07870 null
2024-11-12 CDXFormer: Boosting Remote Sensing Change Detection with Extended Long Short-Term Memory Zhenkai Wu et.al. 2411.07863 link
2024-11-12 Sparsity-Aware Optimization of In-Memory Bayesian Binary Neural Network Accelerators Prabodh Katti et.al. 2411.07842 null
2024-11-12 Novel View Synthesis with Pixel-Space Diffusion Models Noam Elata et.al. 2411.07765 null
2024-11-12 Nanosecond nanothermometry in an electron microscope Florian Castioni et.al. 2411.07764 null
2024-11-12 LapGSR: Laplacian Reconstructive Network for Guided Thermal Super-Resolution Aditya Kasliwal et.al. 2411.07750 null
2024-11-12 The relationship between general equilibrium models with infinite-lived agents and overlapping generations models, and some applications Ngoc-Sang Pham et.al. 2411.07674 null
2024-11-12 Evaluating the Generation of Spatial Relations in Text and Image Generative Models Shang Hong Sim et.al. 2411.07664 null
2024-11-12 Leveraging Previous Steps: A Training-free Fast Solver for Flow Diffusion Kaiyu Song et.al. 2411.07627 null
2024-11-12 Unraveling the Connections between Flow Matching and Diffusion Probabilistic Models in Training-free Conditional Generation Kaiyu Song et.al. 2411.07625 null
2024-11-11 Score-based generative diffusion with "active" correlated noise sources Alexandra Lamtyugina et.al. 2411.07233 null
2024-11-12 Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models Yoad Tewel et.al. 2411.07232 null
2024-11-11 Learning from Limited and Imperfect Data Harsh Rangwani et.al. 2411.07229 null
2024-11-11 TempCharBERT: Keystroke Dynamics for Continuous Access Control Based on Pre-trained Language Models Matheus Simão et.al. 2411.07224 null
2024-11-11 DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID Nyle Siddiqui et.al. 2411.07205 link
2024-11-11 Crossover from inhomogeneous to homogeneous response of a resonantly driven hBN quantum emitter Domitille Gérard et.al. 2411.07202 null
2024-11-11 OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision Cong Wei et.al. 2411.07199 null
2024-11-11 More Expressive Attention with Negative Weights Ang Lv et.al. 2411.07176 link
2024-11-11 Edify 3D: Scalable High-Quality 3D Asset Generation NVIDIA et.al. 2411.07135 null
2024-11-11 Benchmarking LLMs' Judgments with No Gold Standard Shengwei Xu et.al. 2411.07127 link
2024-11-11 Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models NVIDIA et.al. 2411.07126 null
2024-11-11 Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models Yanchen Wang et.al. 2411.07121 link
2024-11-11 Scaling Mesh Generation via Compressive Tokenization Haohan Weng et.al. 2411.07025 link
2024-11-11 An Electrocardiogram Monitoring Device Based on STM32 Wenqi Guan et.al. 2411.06962 null
2024-11-11 Generative Feature Training of Thin 2-Layer Networks Johannes Hertrich et.al. 2411.06848 link
2024-11-08 StdGEN: Semantic-Decomposed 3D Character Generation from Single Images Yuze He et.al. 2411.05738 null
2024-11-08 Image2Text2Image: A Novel Framework for Label-Free Evaluation of Image-to-Text Generation with Text-to-Image Diffusion Models Jia-Hong Huang et.al. 2411.05706 null
2024-11-08 Improving Molecular Graph Generation with Flow Matching and Optimal Transport Xiaoyang Hou et.al. 2411.05676 null
2024-11-08 Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion Nan Song et.al. 2411.05544 null
2024-11-08 Improving image synthesis with diffusion-negative sampling Alakh Desai et.al. 2411.05473 null
2024-11-08 Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation Peidong Liu et.al. 2411.05472 link
2024-11-08 IntellBot: Retrieval Augmented LLM Chatbot for Cyber Threat Knowledge Delivery Dincy R. Arikkat et.al. 2411.05442 null
2024-11-08 RED: Residual Estimation Diffusion for Low-Dose PET Sinogram Reconstruction Xingyu Ai et.al. 2411.05354 null
2024-11-08 Electro-diffusive modeling and the role of spine geometry on action potential propagation in neurons Rahul Gulati et.al. 2411.05329 null
2024-11-08 Social balance in directed networks Bingjie Hao et.al. 2411.05327 null
2024-11-08 SeqRFM: Fast RFM Analysis in Sequence Data Yanxin Zheng et.al. 2411.05317 link
2024-11-08 Differentiable Calibration of Inexact Stochastic Simulation Models via Kernel Score Minimization Ziwei Su et.al. 2411.05315 null
2024-11-08 A Real-time Face Mask Detection and Social Distancing System for COVID-19 using Attention-InceptionV3 Model Abdullah Al Asif et.al. 2411.05312 null
2024-11-08 Adaptive Whole-Body PET Image Denoising Using 3D Diffusion Models with ControlNet Boxiao Yu et.al. 2411.05302 null
2024-11-08 GPT Semantic Cache: Reducing LLM Costs and Latency via Semantic Embedding Caching Sajal Regmi et.al. 2411.05276 null
2024-11-07 SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models Muyang Li et.al. 2411.05007 link
2024-11-07 ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing Jun-Kun Chen et.al. 2411.05006 null
2024-11-07 Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models Shuhong Zheng et.al. 2411.05005 null
2024-11-07 ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning David Junhao Zhang et.al. 2411.05003 null
2024-11-07 SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation Koichi Namekata et.al. 2411.04989 null
2024-11-07 Few-Shot Task Learning through Inverse Generative Modeling Aviv Netanyahu et.al. 2411.04987 null
2024-11-07 How fast does the WallGo? A package for computing wall velocities in first-order phase transitions Andreas Ekstedt et.al. 2411.04970 link
2024-11-07 VAIR: Visuo-Acoustic Implicit Representations for Low-Cost, Multi-Modal Transparent Surface Reconstruction in Indoor Scenes Advaith V. Sethuraman et.al. 2411.04963 null
2024-11-07 Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification Mischa Dombrowski et.al. 2411.04956 null
2024-11-07 Fed-LDR: Federated Local Data-infused Graph Creation with Node-centric Model Refinement Jiechao Gao et.al. 2411.04936 null
2024-11-07 DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion Wenqiang Sun et.al. 2411.04928 null
2024-11-07 StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration Panwen Hu et.al. 2411.04925 null
2024-11-07 Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion Kaizhe Hu et.al. 2411.04919 link
2024-11-07 GASE: Generatively Augmented Sentence Encoding Manuel Frank et.al. 2411.04914 null
2024-11-07 Controlling Human Shape and Pose in Text-to-Image Diffusion Models via Domain Adaptation Benito Buchheim et.al. 2411.04724 null
2024-11-06 Community Forensics: Using Thousands of Generators to Train Fake Image Detectors Jeongsoo Park et.al. 2411.04125 null
2024-11-06 Stepping Forward on the Last Mile Chen Feng et.al. 2411.04036 null
2024-11-06 Prototyping O-RAN Enabled UAV Experimentation for the AERPAW Testbed Joshua Moore et.al. 2411.04027 null
2024-11-06 Object-Centric Dexterous Manipulation from Human Motion Data Yuanpei Chen et.al. 2411.04005 null
2024-11-06 Synomaly Noise and Multi-Stage Diffusion: A Novel Approach for Unsupervised Anomaly Detection in Ultrasound Imaging Yuan Bi et.al. 2411.04004 null
2024-11-06 ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy Chenrui Tie et.al. 2411.03990 null
2024-11-06 ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models Ashutosh Srivastava et.al. 2411.03982 null
2024-11-06 Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning Jiawei Yao et.al. 2411.03978 link
2024-11-06 Bayesian algorithmic perfumery: A Hierarchical Relevance Vector Machine for the Estimation of Personalized Fragrance Preferences based on Three Sensory Layers and Jungian Personality Archetypes Rolando Gonzales Martinez et.al. 2411.03965 null
2024-11-06 Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks Felipe Marra et.al. 2411.03948 link
2024-11-06 Can Custom Models Learn In-Context? An Exploration of Hybrid Architecture Performance on In-Context Learning Tasks Ryan Campbell et.al. 2411.03945 link
2024-11-06 GUIDE-VAE: Advancing Data Generation with User Information and Pattern Dictionaries Kutay Bölat et.al. 2411.03936 null
2024-11-06 Large Generative Model-assisted Talking-face Semantic Communication System Feibo Jiang et.al. 2411.03876 null
2024-11-06 ROBIN: Robust and Invisible Watermarks for Diffusion Models with Adversarial Optimization Huayang Huang et.al. 2411.03862 link
2024-11-06 Sub-DM: Subspace Diffusion Model with Orthogonal Decomposition for MRI Reconstruction Yu Guan et.al. 2411.03758 null
2024-11-05 MME-Finance: A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning Ziliang Gan et.al. 2411.03314 null
2024-11-05 LLMs for Domain Generation Algorithm Detection Reynier Leyva La O et.al. 2411.03307 null
2024-11-05 DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models Ying Zhou et.al. 2411.03250 null
2024-11-05 On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models Tariq Berrada Ifriqi et.al. 2411.03177 null
2024-11-05 Unleashing the power of novel conditional generative approaches for new materials discovery Lev Novitskiy et.al. 2411.03156 link
2024-11-05 Local Lesion Generation is Effective for Capsule Endoscopy Image Data Augmentation in a Limited Data Setting Adrian B. Chłopowiec et.al. 2411.03098 null
2024-11-05 Gradient-Guided Conditional Diffusion Models for Private Image Reconstruction: Analyzing Adversarial Impacts of Differential Privacy and Denoising Tao Huang et.al. 2411.03053 null
2024-11-05 GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details Zhongjin Luo et.al. 2411.03047 null
2024-11-05 Speaker Emotion Recognition: Leveraging Self-Supervised Models for Feature Extraction Using Wav2Vec2 and HuBERT Pourya Jafarzadeh et.al. 2411.02964 null
2024-11-05 IMUDiffusion: A Diffusion Model for Multivariate Time Series Synthetisation for Inertial Motion Capturing Systems Heiko Oppel et.al. 2411.02954 null
2024-11-05 LDPM: Towards undersampled MRI reconstruction with MR-VAE and Latent Diffusion Prior Xingjian Tang et.al. 2411.02951 null
2024-11-05 A scalable generative model for dynamical system reconstruction from neuroimaging data Eric Volkmann et.al. 2411.02949 link
2024-11-05 Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey Ao Fu et.al. 2411.02914 null
2024-11-05 The Unreasonable Effectiveness of LLMs for Query Optimization Peter Akioyamen et.al. 2411.02862 link
2024-11-05 ADOPT: Modified Adam Can Converge with Any $\beta_2$ with the Optimal Rate Shohei Taniguchi et.al. 2411.02853 link
2024-11-04 Training-free Regional Prompting for Diffusion Transformers Anthony Chen et.al. 2411.02395 link
2024-11-04 How Far is Video Generation from World Model: A Physical Law Perspective Bingyi Kang et.al. 2411.02385 null
2024-11-04 Virgo Filaments IV: Using WISE to Measure the Modification of Star-Forming Disks in the Extended Regions Around the Virgo Cluster Kim Conger et.al. 2411.02352 null
2024-11-04 Diffusion-based Generative Multicasting with Intent-aware Semantic Decomposition Xinkai Liu et.al. 2411.02334 null
2024-11-05 PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance Ruyang Liu et.al. 2411.02327 link
2024-11-04 LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation Mufei Li et.al. 2411.02322 link
2024-11-04 CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments Kung-Hsiang Huang et.al. 2411.02305 link
2024-11-04 Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation Xianghui Yang et.al. 2411.02293 null
2024-11-04 Counterfactual Explanations via Riemannian Latent Space Traversal Paraskevas Pegios et.al. 2411.02259 null
2024-11-04 FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training Ruihong Yin et.al. 2411.02229 null
2024-11-04 Recursive Learning of Asymptotic Variational Objectives Alessandro Mastrototaro et.al. 2411.02217 null
2024-11-04 Digi2Real: Bridging the Realism Gap in Synthetic Data Face Recognition via Foundation Models Anjith George et.al. 2411.02188 null
2024-11-04 Touch-to-Touch Translation -- Learning the Mapping Between Heterogeneous Tactile Sensing Technologies Francesco Grella et.al. 2411.02187 null
2024-11-04 CleAR: Robust Context-Guided Generative Lighting Estimation for Mobile Augmented Reality Yiqin Zhao et.al. 2411.02179 null
2024-11-04 CryptoEL: A Novel Experiential Learning Tool for Enhancing K-12 Cryptography Education Pranathi Rayavaram et.al. 2411.02143 null
2024-10-31 Bridging Geometric States via Geometric Diffusion Bridge Shengjie Luo et.al. 2410.24220 null
2024-10-31 Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning Penghui Ruan et.al. 2410.24219 link
2024-10-31 DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion Weicai Ye et.al. 2410.24203 link
2024-10-31 Multi-Attribute Linguistic Tuning for Controlled Paraphrase Generation Mohamed Elgaar et.al. 2410.24199 null
2024-10-31 Generative modelling for mass-mapping with fast uncertainty quantification Jessica J. Whitney et.al. 2410.24197 link
2024-10-31 AR-Pro: Counterfactual Explanations for Anomaly Repair with Formal Properties Xiayan Ji et.al. 2410.24178 null
2024-10-31 Redefining <Creative> in Dictionary: Towards an Enhanced Semantic Understanding of Creative Generation Fu Feng et.al. 2410.24160 null
2024-10-31 Scaling Concept With Text-Guided Diffusion Models Chao Huang et.al. 2410.24151 null
2024-10-31 Repository-Level Compositional Code Translation and Validation Ali Reza Ibrahimzada et.al. 2410.24117 link
2024-10-31 Extended electrochemical monitoring of biomolecular binding using commercially available, reusable electrodes in microliter volumes Jeremy Mendez et.al. 2410.24110 null
2024-10-31 Sparsh: Self-supervised touch representations for vision-based tactile sensing Carolina Higuera et.al. 2410.24090 null
2024-10-31 Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure Xiang Li et.al. 2410.24060 link
2024-10-31 TPC: Test-time Procrustes Calibration for Diffusion-based Human Image Animation Sunjae Yoon et.al. 2410.24037 null
2024-10-31 Unveiling Synthetic Faces: How Synthetic Datasets Can Expose Real Identities Hatef Otroshi Shahreza et.al. 2410.24015 null
2024-10-31 DiffPAD: Denoising Diffusion-based Adversarial Patch Decontamination Jia Fu et.al. 2410.24006 link
2024-10-30 ReferEverything: Towards Segmenting Everything We Can Speak of in Videos Anurag Bagchi et.al. 2410.23287 null
2024-10-30 Provable acceleration for diffusion models under minimal assumptions Gen Li et.al. 2410.23285 null
2024-10-30 RelationBooth: Towards Relation-Aware Customized Object Generation Qingyu Shi et.al. 2410.23280 null
2024-10-30 SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation Yining Hong et.al. 2410.23277 null
2024-10-30 Multi-student Diffusion Distillation for Better One-step Generators Yanke Song et.al. 2410.23274 null
2024-10-30 ReaWristic: Remote Touch Sensation to Fingers from a Wristband via Visually Augmented Electro-Tactile Feedback Yudai Tanaka et.al. 2410.23193 null
2024-10-30 Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning Keqin Bao et.al. 2410.23136 link
2024-10-30 Educating for Hardware Specialization in the Chiplet Era: A Path for the HPC Community Kazutomo Yoshii et.al. 2410.23127 null
2024-10-30 CausalDiff: Causality-Inspired Disentanglement via Diffusion Model for Adversarial Defense Mingkun Zhang et.al. 2410.23091 link
2024-10-30 General Bayesian quantile regression for counts via generative modeling Yuta Yamauchi et.al. 2410.23081 null
2024-10-30 Controlling Language and Diffusion Models by Transporting Activations Pau Rodriguez et.al. 2410.23054 link
2024-10-30 Dispersion kinks from electronic correlations in an unconventional iron-based superconductor Ming-Hua Chang et.al. 2410.23044 null
2024-10-30 Improving Musical Accompaniment Co-creation via Diffusion Transformers Javier Nistal et.al. 2410.23005 null
2024-10-30 DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes Jialiang Zhang et.al. 2410.23004 null
2024-10-30 LumiSculpt: A Consistency Lighting Control Network for Video Generation Yuxin Zhang et.al. 2410.22979 null
2024-10-29 CaStL: Constraints as Specifications through LLM Translation for Long-Horizon Task and Motion Planning Weihang Guo et.al. 2410.22225 null
2024-10-29 A Gaussian Process Generative Model for QCD Equation of State Jiaxuan Gong et.al. 2410.22160 null
2024-10-29 Capacity Control is an Effective Memorization Mitigation Mechanism in Text-Conditional Diffusion Models Raman Dutt et.al. 2410.22149 link
2024-10-29 AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts Vishal Kumar et.al. 2410.22143 null
2024-10-29 Infrared photometry with InGaAs detectors: First light with SPECULOOS Peter P. Pedersen et.al. 2410.22140 link
2024-10-29 SimRec: Mitigating the Cold-Start Problem in Sequential Recommendation by Integrating Item Similarity Shaked Brody et.al. 2410.22136 link
2024-10-29 Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench Zheyuan Liu et.al. 2410.22108 link
2024-10-29 Variational inference for pile-up removal at hadron colliders with diffusion models Malte Algren et.al. 2410.22074 null
2024-10-29 PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement Shutong Jin et.al. 2410.22059 null
2024-10-29 Dual Conditional Diffusion Models for Sequential Recommendation Hongtao Huang et.al. 2410.21967 null
2024-10-29 PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference Kendong Liu et.al. 2410.21966 null
2024-10-29 CT to PET Translation: A Large-scale Dataset and Domain-Knowledge-Guided Diffusion Approach Dac Thai Nguyen et.al. 2410.21932 link
2024-10-29 Guided Diffusion-based Counterfactual Augmentation for Robust Session-based Recommendation Muskan Gupta et.al. 2410.21892 null
2024-10-29 On the study of the limit cycles for a class of population models with time-varying factors Renhao Tian et.al. 2410.21848 null
2024-10-29 Diffusion as Reasoning: Enhancing Object Goal Navigation with LLM-Biased Diffusion Model Yiming Ji et.al. 2410.21842 null
2024-10-28 On Inductive Biases That Enable Generalization of Diffusion Transformers Jie An et.al. 2410.21273 link
2024-10-28 EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation Shih-Yang Liu et.al. 2410.21271 null
2024-10-28 LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior Hanyu Wang et.al. 2410.21264 null
2024-10-28 One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation Zhendong Wang et.al. 2410.21257 null
2024-10-28 On learning higher-order cumulants in diffusion models Gert Aarts et.al. 2410.21212 null
2024-10-28 The VSPEC Collection: A suite of utilities to model spectroscopic phase curves of 3D exoplanet atmospheres in the presence of stellar variability Ted M Johnson et.al. 2410.21190 null
2024-10-28 Trajectory Flow Matching with Applications to Clinical Time Series Modeling Xi Zhang et.al. 2410.21154 link
2024-10-28 Synthetica: Large Scale Synthetic Data for Robot Perception Ritvik Singh et.al. 2410.21153 null
2024-10-28 Extrapolating Prospective Glaucoma Fundus Images through Diffusion Model in Irregular Longitudinal Sequences Zhihao Zhao et.al. 2410.21130 null
2024-10-28 Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models Wenda Li et.al. 2410.21088 link
2024-10-28 Federated Time Series Generation on Feature and Temporally Misaligned Data Chenrui Fan et.al. 2410.21072 null
2024-10-28 Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework Vladimir Arkhipkin et.al. 2410.21061 link
2024-10-28 Beyond Autoregression: Fast LLMs via Self-Distillation Through Time Justin Deschenaux et.al. 2410.21035 link
2024-10-29 EEG-Driven 3D Object Reconstruction with Color Consistency and Diffusion Prior Xin Xiang et.al. 2410.20981 null
2024-10-28 MovieCharacter: A Tuning-Free Framework for Controllable Character Video Synthesis Di Qiu et.al. 2410.20974 null
2024-10-25 Model merging with SVD to tie the Knots George Stoica et.al. 2410.19735 link
2024-10-25 Adversarial Environment Design via Regret-Guided Diffusion Models Hojun Chung et.al. 2410.19715 null
2024-10-25 Perception, Control and Hardware for In-Hand Slip-Aware Object Manipulation with Parallel Grippers Gabriel Arslan Waltersson et.al. 2410.19660 null
2024-10-25 DiffGS: Functional Gaussian Splatting Diffusion Junsheng Zhou et.al. 2410.19657 null
2024-10-25 VARS: Vision-based Assessment of Risk in Security Systems Pranav Gupta et.al. 2410.19642 null
2024-10-25 Diffusion models for lattice gauge field simulations Qianteng Zhu et.al. 2410.19602 null
2024-10-25 Energy Efficient Dual Designs of FeFET-Based Analog In-Memory Computing with Inherent Shift-Add Capability Zeyu Yang et.al. 2410.19593 null
2024-10-25 Hybrid Memetic Search for Electric Vehicle Routing with Time Windows, Simultaneous Pickup-Delivery, and Partial Recharges Zubin Zheng et.al. 2410.19580 null
2024-10-25 Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series Ilan Naiman et.al. 2410.19538 null
2024-10-25 Ensemble Data Assimilation for Particle-based Methods Marius Duvillard et.al. 2410.19525 null
2024-10-25 Marked Temporal Bayesian Flow Point Processes Hui Chen et.al. 2410.19512 null
2024-10-25 EDGE: Enhanced Grounded GUI Understanding with Enriched Multi-Granularity Synthetic Data Xuetian Chen et.al. 2410.19461 null
2024-10-28 NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction Zixuan Gong et.al. 2410.19452 link
2024-10-25 Learned Reference-based Diffusion Sampling for multi-modal distributions Maxence Noble et.al. 2410.19449 null
2024-10-25 Generative Diffusion Models for Sequential Recommendations Sharare Zolghadr et.al. 2410.19429 null
2024-10-24 Framer: Interactive Frame Interpolation Wen Wang et.al. 2410.18978 null
2024-10-24 MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms Ling-Hao Chen et.al. 2410.18977 null
2024-10-24 Unbounded: A Generative Infinite Game of Character Life Simulation Jialu Li et.al. 2410.18975 null
2024-10-24 3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation Hansheng Chen et.al. 2410.18974 link
2024-10-24 On the Crucial Role of Initialization for Matrix Factorization Bingcong Li et.al. 2410.18965 null
2024-10-24 Stable Consistency Tuning: Understanding and Improving Consistency Models Fu-Yun Wang et.al. 2410.18958 link
2024-10-24 Generation of synthetic financial time series by diffusion models Tomonori Takahashi et.al. 2410.18897 null
2024-10-24 Diff-Instruct++: Training One-step Text-to-image Generator Model to Align with Human Preferences Weijian Luo et.al. 2410.18881 null
2024-10-24 The Cat and Mouse Game: The Ongoing Arms Race Between Diffusion Models and Detection Methods Linda Laurier et.al. 2410.18866 null
2024-10-24 From Efficiency to Equity: Measuring Fairness in Preference Learning Shreeyash Gowaikar et.al. 2410.18841 null
2024-10-24 From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages Artur Kiulian et.al. 2410.18836 null
2024-10-24 Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation Xiaoyu Zhang et.al. 2410.18830 null
2024-10-24 Towards Visual Text Design Transfer Across Languages Yejin Choi et.al. 2410.18823 null
2024-10-24 Fast constrained sampling in pre-trained diffusion models Alexandros Graikos et.al. 2410.18804 null
2024-10-24 Large Generative AI Models meet Open Networks for 6G: Integration, Platform, and Monetization Peizheng Li et.al. 2410.18790 null
2024-10-23 DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes Hengwei Bian et.al. 2410.18084 null
2024-10-23 Prioritized Generative Replay Renhao Wang et.al. 2410.18082 null
2024-10-23 WorldSimBench: Towards Video Generation Models as World Simulators Yiran Qin et.al. 2410.18072 null
2024-10-23 TP-Eval: Tap Multimodal LLMs' Potential in Evaluation by Customizing Prompts Yuxuan Xie et.al. 2410.18071 null
2024-10-23 Training Free Guided Flow Matching with Optimal Control Luran Wang et.al. 2410.18070 null
2024-10-23 Spectrally shaped THz pulses from tapered dielectric waveguides Karel Peetermans et.al. 2410.17975 null
2024-10-23 Optical Generative Models Shiqi Chen et.al. 2410.17970 null
2024-10-23 A Wavelet Diffusion GAN for Image Super-Resolution Lorenzo Aloisi et.al. 2410.17966 null
2024-10-23 Addressing Asynchronicity in Clinical Multimodal Fusion via Individualized Chest X-ray Generation Wenfang Yao et.al. 2410.17918 link
2024-10-23 regAL: Python Package for Active Learning of Regression Problems Elizaveta Surzhikova et.al. 2410.17917 null
2024-10-23 Scaling Diffusion Language Models via Adaptation from Autoregressive Models Shansan Gong et.al. 2410.17891 link
2024-10-23 Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech Danilo de Oliveira et.al. 2410.17834 null
2024-10-23 PGDiffSeg: Prior-Guided Denoising Diffusion Model with Parameter-Shared Attention for Breast Cancer Segmentation Feiyan Feng et.al. 2410.17812 null
2024-10-23 GenUDC: High Quality 3D Mesh Generation with Unsigned Dual Contouring Representation Ruowei Wang et.al. 2410.17802 link
2024-10-23 Regularized autoregressive modeling and its application to audio signal declipping Ondřej Mokrý et.al. 2410.17790 link
2024-10-22 Large Language Models Empowered Personalized Web Agents Hongru Cai et.al. 2410.17236 null
2024-10-22 Creativity in AI: Progresses and Challenges Mete Ismayilzada et.al. 2410.17218 null
2024-10-22 Audio-to-Score Conversion Model Based on Whisper methodology Hongyao Zhang et.al. 2410.17209 null
2024-10-22 Reinforcement learning on structure-conditioned categorical diffusion for protein inverse folding Yasha Ektefaie et.al. 2410.17173 link
2024-10-22 Performance of the CMS high-level trigger during LHC Run 2 CMS Collaboration et.al. 2410.17038 null
2024-10-22 Hybrid Generative AI for De Novo Design of Co-Crystals with Enhanced Tabletability Nina Gubina et.al. 2410.17005 link
2024-10-22 DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization Haowei Zhu et.al. 2410.16942 null
2024-10-22 Hierarchical Clustering for Conditional Diffusion in Image Generation Jorge da Silva Goncalves et.al. 2410.16910 link
2024-10-22 Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections Marco Miani et.al. 2410.16901 null
2024-10-22 VistaDream: Sampling multiview consistent images for single-view scene reconstruction Haiping Wang et.al. 2410.16892 null
2024-10-22 CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare Nicholas I-Hsien Kuo et.al. 2410.16872 null
2024-10-22 MPDS: A Movie Posters Dataset for Image Generation with Diffusion Model Meng Xu et.al. 2410.16840 null
2024-10-22 Bridging Search and Recommendation in Generative Retrieval: Does One Task Help the Other? Gustavo Penha et.al. 2410.16823 null
2024-10-22 Evaluating the Effectiveness of Attack-Agnostic Features for Morphing Attack Detection Laurent Colbois et.al. 2410.16802 link
2024-10-22 One-Step Diffusion Distillation through Score Implicit Matching Weijian Luo et.al. 2410.16794 link
2024-10-21 MvDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors Honghua Chen et.al. 2410.16272 null
2024-10-21 Agent-to-Sim: Learning Interactive Behavior Models from Casual Longitudinal Videos Gengshan Yang et.al. 2410.16259 null
2024-10-21 Distribution Learning with Valid Outputs Beyond the Worst-Case Nick Rittler et.al. 2410.16253 null
2024-10-21 Building A Coding Assistant via the Retrieval-Augmented Language Model Xinze Li et.al. 2410.16229 link
2024-10-21 CiteClick: A Browser Extension for Real-Time Scholar Citation Tracking Nishat Raihan et.al. 2410.16211 null
2024-10-21 A Framework for Evaluating Predictive Models Using Synthetic Image Covariates and Longitudinal Data Simon Deltadahl et.al. 2410.16177 null
2024-10-22 Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models Giannis Daras et.al. 2410.16152 null
2024-10-21 Modelling Structured Data Learning with Restricted Boltzmann Machines in the Teacher-Student Setting Robin Thériault et.al. 2410.16150 null
2024-10-21 SeaDAG: Semi-autoregressive Diffusion for Conditional Directed Acyclic Graph Generation Xinyi Zhou et.al. 2410.16119 null
2024-10-21 Critical Example Mining for Vehicle Trajectory Prediction using Flow-based Generative Models Zhezhang Ding et.al. 2410.16083 null
2024-10-21 Continuous Speech Synthesis using per-token Latent Diffusion Arnon Turetzky et.al. 2410.16048 null
2024-10-21 Some generalizations of the convective model of jet generation S. N. Artekha et.al. 2410.16035 null
2024-10-21 ComPO: Community Preferences for Language Model Personalization Sachin Kumar et.al. 2410.16027 null
2024-10-21 Massimo: Public Queue Monitoring and Management using Mass-Spring Model Abhijeet Kumar et.al. 2410.16012 null
2024-10-21 AI-Driven Innovations in Modern Cloud Computing Animesh Kumar et.al. 2410.15960 null
2024-10-18 BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities Shaozhe Hao et.al. 2410.14672 link
2024-10-18 How Does Data Diversity Shape the Weight Landscape of Neural Networks? Yang Ba et.al. 2410.14602 null
2024-10-18 Bayesian Multi-wavelength Imaging of the LMC SN1987A with SRG/eROSITA Vincent Eberle et.al. 2410.14599 null
2024-10-18 Neuro-Symbolic Traders: Assessing the Wisdom of AI Crowds in Markets Namid R. Stillman et.al. 2410.14587 null
2024-10-18 Reimagining partial thickness keratoplasty: An eye mountable robot for autonomous big bubble needle insertion Y. Wang et.al. 2410.14577 null
2024-10-18 Multi-modal Pose Diffuser: A Multimodal Generative Conditional Pose Prior Calvin-Khang Ta et.al. 2410.14540 null
2024-10-18 Blockchain-Based Trust and Transparency in Airline Reservation Systems using Microservices Architecture Biman Barua et.al. 2410.14518 null
2024-10-18 LEAD: Latent Realignment for Human Motion Diffusion Nefeli Andreou et.al. 2410.14508 null
2024-10-18 Reinforcement Learning in Non-Markov Market-Making Luca Lalor et.al. 2410.14504 null
2024-10-18 Data-driven topology design with persistent homology for enhancing population diversity Taisei Kii et.al. 2410.14496 null
2024-10-18 ANT: Adaptive Noise Schedule for Time Series Diffusion Models Seunghan Lee et.al. 2410.14488 link
2024-10-21 CaTs and DAGs: Integrating Directed Acyclic Graphs with Transformers and Fully-Connected Neural Networks for Causally Constrained Predictions Matthew J. Vowels et.al. 2410.14485 link
2024-10-18 DRL Optimization Trajectory Generation via Wireless Network Intent-Guided Diffusion Models for Optimizing Resource Allocation Junjie Wu et.al. 2410.14481 null
2024-10-18 Flow-based Sampling for Entanglement Entropy and the Machine Learning of Defects Andrea Bulgarelli et.al. 2410.14466 null
2024-10-18 FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models Rui Hu et.al. 2410.14429 null
2024-10-17 Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens Lijie Fan et.al. 2410.13863 null
2024-10-17 Diffusing States and Matching Scores: A New Framework for Imitation Learning Runzhe Wu et.al. 2410.13855 link
2024-10-17 Influence Functions for Scalable Data Attribution in Diffusion Models Bruno Mlodozeniec et.al. 2410.13850 null
2024-10-17 VidPanos: Generative Panoramic Videos from Casual Panning Videos Jingwei Ma et.al. 2410.13832 null
2024-10-17 DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control Yujie Wei et.al. 2410.13830 null
2024-10-17 Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning Xiaodan Xing et.al. 2410.13823 link
2024-10-17 ConsisSR: Delving Deep into Consistency in Diffusion-based Image Super-Resolution Junhao Gu et.al. 2410.13807 null
2024-10-17 Probing the Latent Hierarchical Structure of Data via Diffusion Models Antonio Sclocchi et.al. 2410.13770 null
2024-10-17 Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers Yuchen Liang et.al. 2410.13746 null
2024-10-17 Improved Convergence Rate for Diffusion Probabilistic Models Gen Li et.al. 2410.13738 null
2024-10-17 Optimizing Probabilistic Conformal Prediction with Vectorized Non-Conformity Scores Minxing Zheng et.al. 2410.13735 null
2024-10-18 DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation Hanbo Cheng et.al. 2410.13726 link
2024-10-17 Movie Gen: A Cast of Media Foundation Models Adam Polyak et.al. 2410.13720 link
2024-10-18 Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion Yijun Liang et.al. 2410.13674 link
2024-10-17 Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design Chenyu Wang et.al. 2410.13643 link
2024-10-16 Geometry-Aware Generative Autoencoders for Warped Riemannian Metric Learning and Generative Modeling on Data Manifolds Xingzhi Sun et.al. 2410.12779 null
2024-10-16 Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts Hongcheng Gao et.al. 2410.12777 link
2024-10-16 SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation Jaehong Yoon et.al. 2410.12761 null
2024-10-16 Signature of Vertical Mixing in Hydrogen-dominated Exoplanet Atmospheres Vikas Soni et.al. 2410.12737 null
2024-10-16 Counterfactual Generative Modeling with Variational Causal Inference Yulun Wu et.al. 2410.12730 null
2024-10-16 FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression Zhenheng Tang et.al. 2410.12707 null
2024-10-16 Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization Xingqi Wang et.al. 2410.12700 link
2024-10-16 AdaptiveDrag: Semantic-Driven Dragging on Diffusion-Based Image Editing DuoSheng Chen et.al. 2410.12696 null
2024-10-16 3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation Dewei Zhou et.al. 2410.12669 null
2024-10-16 Towards Designing Scalable Quantum-Enhanced Generative Networks for Neutrino Physics Experiments with Liquid Argon Time Projection Chambers Andrea Delgado et.al. 2410.12650 null
2024-10-16 A Robo-Advisor System: expected utility modeling via pairwise comparisons Bo Chen et.al. 2410.12570 null
2024-10-16 One Step Diffusion via Shortcut Models Kevin Frans et.al. 2410.12557 link
2024-10-16 Disentangling data distribution for Federated Learning Xinyuan Zhao et.al. 2410.12530 null
2024-10-16 Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing Mingce Guo et.al. 2410.12526 null
2024-10-16 MING: A Functional Approach to Learning Molecular Generative Models Van Khoa Nguyen et.al. 2410.12522 null
2024-10-15 High-Resolution Frame Interpolation with Patch-based Cascaded Diffusion Junhwa Hur et.al. 2410.11838 null
2024-10-15 On the Effectiveness of Dataset Alignment for Fake Image Detection Anirudh Sundara Rajan et.al. 2410.11835 null
2024-10-15 Bayesian Experimental Design via Contrastive Diffusions Jacopo Iollo et.al. 2410.11826 link
2024-10-15 KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities Hsin-Ping Huang et.al. 2410.11824 null
2024-10-15 Improving Long-Text Alignment for Text-to-Image Diffusion Models Luping Liu et.al. 2410.11817 link
2024-10-15 SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing Zhiyuan Zhang et.al. 2410.11815 null
2024-10-16 Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices Zhiyuan Ma et.al. 2410.11795 null
2024-10-15 G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks Guibin Zhang et.al. 2410.11782 null
2024-10-15 Technical Report of 1:10 Scale Autonomous Vehicle Robot Amirhossein Kheiri Holighi et.al. 2410.11746 null
2024-10-15 Probabilistic Principles for Biophysics and Neuroscience: Entropy Production, Bayesian Mechanics & the Free-Energy Principle Lancelot Da Costa et.al. 2410.11735 null
2024-10-15 Patch-Based Diffusion Models Beat Whole-Image Models for Mismatched Distribution Inverse Problems Jason Hu et.al. 2410.11730 null
2024-10-15 Parameter estimation of structural dynamics with neural operators enabled surrogate modeling Mingyuan Zhou et.al. 2410.11712 null
2024-10-15 Findings of the WMT 2024 Shared Task on Chat Translation Wafaa Mohammed et.al. 2410.11624 null
2024-10-15 DeformPAM: Data-Efficient Learning for Long-horizon Deformable Object Manipulation via Preference-based Action Alignment Wendi Chen et.al. 2410.11584 link
2024-10-15 A Data-Driven Aggressive Autonomous Racing Framework Utilizing Local Trajectory Planning with Velocity Prediction Zhouheng Li et.al. 2410.11570 link
2024-10-14 Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models Jingzhi Bao et.al. 2410.10821 link
2024-10-15 TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models Mu Cai et.al. 2410.10818 link
2024-10-14 LVD-2M: A Long-take Video Dataset with Temporally Dense Captions Tianwei Xiong et.al. 2410.10816 link
2024-10-14 Depth Any Video with Scalable Synthetic Data Honghui Yang et.al. 2410.10815 link
2024-10-14 HART: Efficient Visual Generation with Hybrid Autoregressive Transformer Haotian Tang et.al. 2410.10812 link
2024-10-14 TrajDiffuse: A Conditional Diffusion Model for Environment-Aware Trajectory Prediction Qingze et.al. 2410.10804 link
2024-10-14 Boosting Camera Motion Control for Video Diffusion Transformers Soon Yau Cheong et.al. 2410.10802 null
2024-10-14 Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations Litu Rout et.al. 2410.10792 null
2024-10-14 ControlMM: Controllable Masked Motion Generation Ekkasit Pinyoanuntapong et.al. 2410.10780 null
2024-10-14 Adaptive Diffusion Terrain Generator for Autonomous Uneven Terrain Navigation Youwei Yu et.al. 2410.10766 null
2024-10-14 DragEntity: Trajectory Guided Video Generation using Entity and Positional Relationships Zhang Wan et.al. 2410.10751 null
2024-10-14 CosForce: A Force-Based General Model for Simulating Pedestrian Anticipation and Reaction Mechanisms Jinghui Wang et.al. 2410.10746 null
2024-10-14 FlexGen: Flexible Multi-View Generation from Text and Image Inputs Xinli Xu et.al. 2410.10745 null
2024-10-14 Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models Junyu Chen et.al. 2410.10733 link
2024-10-14 Large Language Models Are Active Critics in NLG Evaluation Shuying Xu et.al. 2410.10724 null
2024-10-11 SceneCraft: Layout-Guided 3D Scene Generation Xiuyu Yang et.al. 2410.09049 link
2024-10-11 Linear Convergence of Diffusion Models Under the Manifold Hypothesis Peter Potaptchik et.al. 2410.09046 null
2024-10-11 PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents Xiangyu Yin et.al. 2410.09034 link
2024-10-11 Semantic Score Distillation Sampling for Compositional Text-to-3D Generation Ling Yang et.al. 2410.09009 link
2024-10-11 WaveDiffusion: Exploring Full Waveform Inversion via Joint Diffusion in the Latent Space Hanchen Wang et.al. 2410.09002 null
2024-10-11 Maximizing the Potential of Synthetic Data: Insights from Random Matrix Theory Aymane El Firdoussi et.al. 2410.08942 null
2024-10-11 DiffPO: A causal diffusion model for learning distributions of potential outcomes Yuchen Ma et.al. 2410.08924 null
2024-10-11 An End-to-End Deep Learning Method for Solving Nonlocal Allen-Cahn and Cahn-Hilliard Phase-Field Models Yuwei Geng et.al. 2410.08914 null
2024-10-11 Conditional Generative Models for Contrast-Enhanced Synthesis of T1w and T1 Maps in Brain MRI Moritz Piening et.al. 2410.08894 link
2024-10-11 MATCH: Model-Aware TVM-based Compilation for Heterogeneous Edge Devices Mohamed Amine Hamdi et.al. 2410.08855 link
2024-10-14 LIME-Eval: Rethinking Low-light Image Enhancement Evaluation via Object Detection Mingjia Li et.al. 2410.08810 link
2024-10-11 Bad Neighbors: On Understanding VPN Provider Networks Teemu Rytilahti et.al. 2410.08737 link
2024-10-11 5G as Enabler for Industrie 4.0 Use Cases: Challenges and Concepts M. Gundall et.al. 2410.08726 null
2024-10-11 Investigating Human-Computer Interaction and Visual Comprehension in Text Generation Process of Natural Language Generation Models Yunchao Wang et.al. 2410.08723 null
2024-10-11 Impact of Surface Reflections in Maritime Obstacle Detection Samed Yalçın et.al. 2410.08713 link
2024-10-10 LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts Anh-Quan Cao et.al. 2410.08211 null
2024-10-10 DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models Xiaoxiao He et.al. 2410.08207 null
2024-10-10 HybridBooth: Hybrid Prompt Inversion for Efficient Subject-Driven Generation Shanyan Guan et.al. 2410.08192 null
2024-10-10 DifFRelight: Diffusion-Based Facial Performance Relighting Mingming He et.al. 2410.08188 null
2024-10-10 RGM: Reconstructing High-fidelity 3D Car Assets with Relightable 3D-GS Generative Model from a Single Image Xiaoxue Chen et.al. 2410.08181 null
2024-10-10 ZeroComp: Zero-shot Object Compositing from Image Intrinsics via Diffusion Zitian Zhang et.al. 2410.08168 null
2024-10-10 DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation Jiatao Gu et.al. 2410.08159 null
2024-10-10 Progressive Autoregressive Video Diffusion Models Desai Xie et.al. 2410.08151 link
2024-10-10 Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction Jarrid Rector-Brooks et.al. 2410.08134 null
2024-10-10 Robust AI-Generated Text Detection by Restricted Embeddings Kristian Kuznetsov et.al. 2410.08113 link
2024-10-10 LiPO: LiDAR Inertial Odometry for ICP Comparison Darwin Mick et.al. 2410.08097 null
2024-10-10 Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models Vinith M. Suriyakumar et.al. 2410.08074 null
2024-10-10 Reversible Decoupling Network for Single Image Reflection Removal Hao Zhao et.al. 2410.08063 link
2024-10-10 A Target-Aware Analysis of Data Augmentation for Hate Speech Detection Camilla Casula et.al. 2410.08053 null
2024-10-10 LADIMO: Face Morph Generation through Biometric Template Inversion with Latent Diffusion Marcel Grimmer et.al. 2410.07988 link
2024-10-09 IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation Xinchen Zhang et.al. 2410.07171 link
2024-10-09 Sylber: Syllabic Embedding Representation of Speech from Raw Audio Cheol Jun Cho et.al. 2410.07168 link
2024-10-09 AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation Yukang Cao et.al. 2410.07164 null
2024-10-09 InstructG2I: Synthesizing Images from Multimodal Attributed Graphs Bowen Jin et.al. 2410.07157 link
2024-10-09 Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis Bohan Zeng et.al. 2410.07155 link
2024-10-10 EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models Rui Zhao et.al. 2410.07133 link
2024-10-09 Personalized Visual Instruction Tuning Renjie Pi et.al. 2410.07113 link
2024-10-09 A Gentle Introduction and Tutorial on Deep Generative Models in Transportation Research Seongjin Choi et.al. 2410.07066 link
2024-10-09 Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax Ivan Butakov et.al. 2410.06993 null
2024-10-09 Diffusion Density Estimators Akhil Premkumar et.al. 2410.06986 null
2024-10-09 Jointly Generating Multi-view Consistent PBR Textures using Collaborative Control Shimon Vainer et.al. 2410.06985 null
2024-10-09 Structure-Centric Robust Monocular Depth Estimation via Knowledge Distillation Runze Chen et.al. 2410.06982 null
2024-10-09 Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think Sihyun Yu et.al. 2410.06940 link
2024-10-09 VEC-Sim: A Simulation Platform for Evaluating Service Caching and Computation Offloading Policies in Vehicular Edge Networks Fan Wu et.al. 2410.06934 null
2024-10-09 Generative Model for Less-Resourced Language with 1 billion parameters Domen Vreš et.al. 2410.06898 null
2024-10-07 DART: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control Kaifeng Zhao et.al. 2410.05260 null
2024-10-07 GS-VTON: Controllable 3D Virtual Try-on with Gaussian Splatting Yukang Cao et.al. 2410.05259 null
2024-10-07 SePPO: Semi-Policy Preference Optimization for Diffusion Alignment Daoan Zhang et.al. 2410.05255 link
2024-10-07 DiffuseReg: Denoising Diffusion Model for Obtaining Deformation Fields in Unsupervised Deformable Image Registration Yongtai Zhuo et.al. 2410.05234 link
2024-10-07 Density estimation with LLMs: a geometric investigation of in-context learning trajectories Toni J. B. Liu et.al. 2410.05218 null
2024-10-07 Avoiding Deadlocks via Weak Deadlock Sets Gianpaolo Oriolo et.al. 2410.05175 null
2024-10-07 Presto! Distilling Steps and Layers for Accelerating Music Generation Zachary Novack et.al. 2410.05167 null
2024-10-08 A Simulation-Free Deep Learning Approach to Stochastic Optimal Control Mengjian Hua et.al. 2410.05163 null
2024-10-07 Smart Jamming Attack and Mitigation on Deep Transfer Reinforcement Learning Enabled Resource Allocation for Network Slicing Shavbo Salehi et.al. 2410.05153 null
2024-10-07 Leveraging Multimodal Diffusion Models to Accelerate Imaging with Side Information Timofey Efimov et.al. 2410.05143 null
2024-10-07 Agnostic Smoothed Online Learning Moïse Blanchard et.al. 2410.05124 null
2024-10-07 Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning Ayano Hiranaka et.al. 2410.05116 null
2024-10-07 Synthetic Generation of Dermatoscopic Images with GAN and Closed-Form Factorization Rohan Reddy Mekala et.al. 2410.05114 null
2024-10-07 Hyper-Representations: Learning from Populations of Neural Networks Konstantin Schürholt et.al. 2410.05107 link
2024-10-07 DreamSat: Towards a General 3D Model for Novel View Synthesis of Space Objects Nidhi Mathihalli et.al. 2410.05097 link
2024-10-04 Estimating Body and Hand Motion in an Ego-sensed World Brent Yi et.al. 2410.03665 null
2024-10-04 Enhance Reasoning by Learning from Mistakes: Peer-Review Knowledge Distillation from Multiple Large Language Models Zhuochun Li et.al. 2410.03663 null
2024-10-04 Geometric Representation Condition Improves Equivariant Molecule Generation Zian Li et.al. 2410.03655 null
2024-10-04 Aligning LLMs with Individual Preferences via Interaction Shujin Wu et.al. 2410.03642 link
2024-10-04 Real-World Benchmarks Make Membership Inference Attacks Fail on Diffusion Models Chumeng Liang et.al. 2410.03640 link
2024-10-04 Conditional Enzyme Generation Using Protein Language Models with Adapters Jason Yang et.al. 2410.03634 null
2024-10-04 How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework Yinuo Ren et.al. 2410.03601 null
2024-10-04 Teaching Transformers Modular Arithmetic at Scale Eshika Saxena et.al. 2410.03569 null
2024-10-04 Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features Benyuan Meng et.al. 2410.03558 link
2024-10-04 Loading Ceramics: Visualising Possibilities of Robotics in Ceramics Varvara Guljajeva et.al. 2410.03550 null
2024-10-04 NRGBoost: Energy-Based Generative Boosted Trees João Bravo et.al. 2410.03535 null
2024-10-04 Generative Artificial Intelligence for Navigating Synthesizable Chemical Space Wenhao Gao et.al. 2410.03494 link
2024-10-04 SeBS-Flow: Benchmarking Serverless Cloud Function Workflows Larissa Schmid et.al. 2410.03480 null
2024-10-04 Formalizing MLTL Formula Progression in Isabelle/HOL Katherine Kosaian et.al. 2410.03465 null
2024-10-04 Diffusion State-Guided Projected Gradient for Inverse Problems Rayhan Zirvi et.al. 2410.03463 null
2024-10-03 SIEVE: General Purpose Data Filtering System Matching GPT-4o Accuracy at 1% the Cost Jifan Zhang et.al. 2410.02755 null
2024-10-03 CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation Han He et.al. 2410.02748 null
2024-10-03 Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization Lei Xu et.al. 2410.02741 link
2024-10-03 Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models Zhengfeng Lai et.al. 2410.02740 null
2024-10-03 Custom Non-Linear Model Predictive Control for Obstacle Avoidance in Indoor and Outdoor Environments Lara Laban et.al. 2410.02732 link
2024-10-03 A Photonic Parameter-shift Rule: Enabling Gradient Computation for Photonic Quantum Computers Axel Pappalardo et.al. 2410.02726 null
2024-10-03 AlzhiNet: Traversing from 2DCNN to 3DCNN, Towards Early Detection and Diagnosis of Alzheimer's Disease Romoke Grace Akindele et.al. 2410.02714 null
2024-10-03 SteerDiff: Steering towards Safe Text-to-Image Diffusion Models Hongxiang Zhang et.al. 2410.02710 null
2024-10-03 ControlAR: Controllable Image Generation with Autoregressive Models Zongming Li et.al. 2410.02705 link
2024-10-03 User-centric Immersive Communications in 6G: A Data-oriented Approach via Digital Twin Conghao Zhou et.al. 2410.02688 null
2024-10-03 GUD: Generation with Unified Diffusion Mathis Gerdes et.al. 2410.02667 null
2024-10-03 Grounded Answers for Multi-agent Decision-making Problem through Generative World Model Zeyang Liu et.al. 2410.02664 null
2024-10-03 Scalable Simulation-free Entropic Unbalanced Optimal Transport Jaemoo Choi et.al. 2410.02656 null
2024-10-03 Measuring and Improving Persuasiveness of Generative Models Somesh Singh et.al. 2410.02653 null
2024-10-03 Efficient calibration of the shifted square-root diffusion model to credit default swap spreads using asymptotic approximations Ankush Agarwal et.al. 2410.02645 null
2024-10-02 FabricDiffusion: High-Fidelity Texture Transfer for 3D Garments Generation from In-The-Wild Clothing Images Cheng Zhang et.al. 2410.01801 null
2024-10-02 Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space Yangming Li et.al. 2410.01796 null
2024-10-02 Dynamical-generative downscaling of climate model ensembles Ignacio Lopez-Gomez et.al. 2410.01776 null
2024-10-02 Towards deep learning sequence-structure co-generation for protein design Chentong Wang et.al. 2410.01773 null
2024-10-02 ImageFolder: Autoregressive Image Generation with Folded Tokens Xiang Li et.al. 2410.01756 link
2024-10-02 AssessITS: Integrating procedural guidelines and practical evaluation metrics for organizational IT and Cybersecurity risk assessment Mir Mehedi Rahman et.al. 2410.01750 null
2024-10-02 VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models Kailai Feng et.al. 2410.01738 link
2024-10-02 HarmoniCa: Harmonizing Training and Inference for Better Feature Cache in Diffusion Transformer Acceleration Yushi Huang et.al. 2410.01723 null
2024-10-02 Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective Zeyu Gan et.al. 2410.01720 link
2024-10-02 COMUNI: Decomposing Common and Unique Video Signals for Diffusion-based Video Generation Mingzhen Sun et.al. 2410.01718 null
2024-10-02 A Mathematics-Inspired Learning-to-Optimize Framework for Decentralized Optimization Yutong He et.al. 2410.01700 null
2024-10-02 Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding Yao Teng et.al. 2410.01699 link
2024-10-02 Lossy Semantic Communication for the Logical Deduction of the State of the World Ahmet Faruk Saz et.al. 2410.01676 link
2024-10-02 Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering Klaus-Rudolf Kladny et.al. 2410.01660 null
2024-10-02 On The Adaptation of Unlimiformer for Decoder-Only Transformers Kian Ahrabian et.al. 2410.01637 null
2024-09-30 SpaceMesh: A Continuous Representation for Learning Manifold Surface Meshes Tianchang Shen et.al. 2409.20562 null
2024-09-30 Annealing Flow Generative Model Towards Sampling High-Dimensional and Multi-Modal Distributions Dongze Wu et.al. 2409.20547 link
2024-09-30 A Compact Quantum Random Number Generator Based on Balanced Detection of Shot Noise Jaideep Singh et.al. 2409.20515 null
2024-09-30 NUTRIVISION: A System for Automatic Diet Management in Smart Healthcare Madhumita Veeramreddy et.al. 2409.20508 null
2024-09-30 COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models Divyanshu Daiya et.al. 2409.20502 null
2024-09-30 FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing Lingling Cai et.al. 2409.20500 null
2024-09-30 All-optical autoencoder machine learning framework using diffractive processors Peijie Feng et.al. 2409.20346 null
2024-09-30 Devil is in Details: Locality-Aware 3D Abdominal CT Volume Generation for Self-Supervised Organ Segmentation Yuran Wang et.al. 2409.20332 null
2024-09-30 UIR-LoRA: Achieving Universal Image Restoration through Multiple Low-Rank Adaptation Cheng Zhang et.al. 2409.20197 link
2024-09-30 Ensemble Kalman Diffusion Guidance: A Derivative-free Method for Inverse Problems Hongkai Zheng et.al. 2409.20175 null
2024-09-30 Erase, then Redraw: A Novel Data Augmentation Approach for Free Space Detection Using Diffusion Model Fulong Ma et.al. 2409.20164 null
2024-09-30 Conditional Diffusion Models are Minimax-Optimal and Manifold-Adaptive for Conditional Distribution Estimation Rong Tang et.al. 2409.20124 null
2024-09-30 Training a Computer Vision Model for Commercial Bakeries with Primarily Synthetic Images Thomas H. Schmitt et.al. 2409.20122 null
2024-09-30 Reaction-diffusion model for a population structured in phenotype and space I -- Criterion for persistence Nathanaël Boutillon et.al. 2409.20118 null
2024-09-30 Near-Field Coupling Coil System: A Novel Radiofrequency Coil Solution for MRI Zhiguang Mo et.al. 2409.20095 null
2024-09-27 $O(d/T)$ Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions Gen Li et.al. 2409.18959 null
2024-09-27 ReviveDiff: A Universal Diffusion Model for Restoring Images in Adverse Weather Conditions Wenfeng Huang et.al. 2409.18932 null
2024-09-27 Unsupervised Low-light Image Enhancement with Lookup Tables and Diffusion Priors Yunlong Lin et.al. 2409.18899 null
2024-09-27 Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis Songrui Wang et.al. 2409.18897 null
2024-09-27 HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models Yu Zhou et.al. 2409.18893 null
2024-09-27 Explainable Artifacts for Synthetic Western Blot Source Attribution João Phillipe Cardenuto et.al. 2409.18881 link
2024-09-27 Emu3: Next-Token Prediction is All You Need Xinlong Wang et.al. 2409.18869 null
2024-09-27 Challenges of Generating Structurally Diverse Graphs Fedor Velikonivtsev et.al. 2409.18859 link
2024-09-27 Moldable Development Patterns Oscar Nierstrasz et.al. 2409.18811 null
2024-09-27 Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions Iskander Azangulov et.al. 2409.18804 null
2024-09-27 Student-Oriented Teacher Knowledge Refinement for Knowledge Distillation Chaomin Shen et.al. 2409.18785 null
2024-09-27 Geometric deep learning for galaxy-halo connection: a case study for galaxy intrinsic alignments Yesukhei Jagvaral et.al. 2409.18761 null
2024-09-27 Cottention: Linear Transformers With Cosine Attention Gabriel Mongaras et.al. 2409.18747 link
2024-09-27 Read Over the Lines: Attacking LLMs and Toxicity Detection Systems with ASCII Art to Mask Profanity Sergey Berezin et.al. 2409.18708 link
2024-09-27 MG-Net: Learn to Customize QAOA with Circuit Depth Awareness Yang Qian et.al. 2409.18692 link
2024-09-26 FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner Wenliang Zhao et.al. 2409.18128 link
2024-09-26 Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction Jing He et.al. 2409.18124 null
2024-09-26 EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation Jiaxiang Tang et.al. 2409.18114 null
2024-09-26 MALPOLON: A Framework for Deep Species Distribution Modeling Theo Larcher et.al. 2409.18102 link
2024-09-26 StackGen: Generating Stable Structures from Silhouettes via Diffusion Luzhe Sun et.al. 2409.18098 null
2024-09-26 DiffSSC: Semantic LiDAR Scan Completion using Denoising Diffusion Probabilistic Models Helin Cao et.al. 2409.18092 null
2024-09-26 Stable Video Portraits Mirela Ostrek et.al. 2409.18083 null
2024-09-26 LightAvatar: Efficient Head Avatar as Dynamic Neural Light Field Huan Wang et.al. 2409.18057 link
2024-09-26 Automated Detection and Analysis of Power Words in Persuasive Text Using Natural Language Processing Sahil Garje et.al. 2409.18033 null
2024-09-26 PhoCoLens: Photorealistic and Consistent Reconstruction in Lensless Imaging Xin Cai et.al. 2409.17996 null
2024-09-26 Joint Localization and Planning using Diffusion L. Lao Beyer et.al. 2409.17995 null
2024-09-26 Manufacturing, processing, applications, and advancements of Fe-based shape memory alloys Anwar Algamal et.al. 2409.17973 null
2024-09-26 CNCA: Toward Customizable and Natural Generation of Adversarial Camouflage for Vehicle Detectors Linye Lyu et.al. 2409.17963 null
2024-09-26 Relativistic diffusion model for hadron production in p-Pb collisions at the LHC Philipp Schulz et.al. 2409.17960 null
2024-09-26 Perturb, Attend, Detect and Localize (PADL): Robust Proactive Image Defense Filippo Bartolucci et.al. 2409.17941 null
2024-09-25 DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion Yukun Huang et.al. 2409.17145 link
2024-09-25 Language-oriented Semantic Communication for Image Transmission with Fine-Tuned Diffusion Model Xinfeng Wei et.al. 2409.17104 null
2024-09-25 Accumulator-Aware Post-Training Quantization Ian Colbert et.al. 2409.17092 null
2024-09-25 Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification Xinrui Zhou et.al. 2409.17091 null
2024-09-25 Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors Aiping Zhang et.al. 2409.17058 link
2024-09-25 ControlCity: A Multimodal Diffusion Model Based Approach for Accurate Geospatial Data Generation and Urban Morphology Analysis Fangshuo Zhou et.al. 2409.17049 link
2024-09-25 GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design Phillip Mueller et.al. 2409.17045 null
2024-09-25 CNN Mixture-of-Depths Rinor Cakaj et.al. 2409.17016 null
2024-09-25 Single Image, Any Face: Generalisable 3D Face Generation Wenqing Wang et.al. 2409.16990 null
2024-09-25 Dynamic Obstacle Avoidance through Uncertainty-Based Adaptive Planning with Diffusion Vineet Punyamoorty et.al. 2409.16950 null
2024-09-25 DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling Kyuheon Jung et.al. 2409.16949 link
2024-09-25 Divergence asymmetry and connected components in a general duplication-divergence graph model Dario Borrelli et.al. 2409.16943 null
2024-09-25 Generative Object Insertion in Gaussian Splatting with a Multi-View Diffusion Model Hongliang Zhong et.al. 2409.16938 link
2024-09-25 Linking in Style: Understanding learned features in deep learning models Maren H. Wehrheim et.al. 2409.16865 link
2024-09-25 A Versatile and Differentiable Hand-Object Interaction Representation Théo Morales et.al. 2409.16855 null
2024-09-18 Massively Multi-Person 3D Human Motion Forecasting with Scene Context Felix B Mueller et.al. 2409.12189 link
2024-09-18 MoRAG -- Multi-Fusion Retrieval Augmented Generation for Human Motion Kalakonda Sai Shashank et.al. 2409.12140 null
2024-09-24 Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models Sijing Chen et.al. 2409.12139 null
2024-09-18 Brain-Streams: fMRI-to-Image Reconstruction with Multi-modal Guidance Jaehoon Joo et.al. 2409.12099 null
2024-09-19 Skill matching at scale: freelancer-project alignment for efficient multilingual candidate retrieval Warren Jouanneau et.al. 2409.12097 null
2024-09-18 Design of Ligand-Binding Proteins with Atomic Flow Matching Junqi Liu et.al. 2409.12080 null
2024-09-18 Denoising diffusion models for high-resolution microscopy image restoration Pamela Osuna-Vargas et.al. 2409.12078 null
2024-09-19 Using Large Language Models to Generate Clinical Trial Tables and Figures Yumeng Yang et.al. 2409.12046 null
2024-09-18 LEMON: Localized Editing with Mesh Optimization and Neural Shaders Furkan Mert Algan et.al. 2409.12024 null
2024-09-18 Promise and Peril of Collaborative Code Generation Models: Balancing Effectiveness and Memorization Zhi Chen et.al. 2409.12020 null
2024-09-18 Towards Global Localization using Multi-Modal Object-Instance Re-Identification Aneesh Chavan et.al. 2409.12002 link
2024-09-18 Tracking Any Point with Frame-Event Fusion Network at High Frame Rate Jiaxiong Liu et.al. 2409.11953 null
2024-09-18 Generation of Complex 3D Human Motion by Temporal and Spatial Composition of Diffusion Models Lorenzo Mandelli et.al. 2409.11920 null
2024-09-18 AlignBot: Aligning VLM-powered Customized Task Planning with User Reminders Through Fine-Tuning for Household Robots Zhaxizhuoma et.al. 2409.11905 null
2024-09-18 Finding the Subjective Truth: Collecting 2 Million Votes for Comprehensive Gen-AI Model Evaluation Dimitrios Christodoulou et.al. 2409.11904 null
2024-09-17 Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion Zhenwei Wang et.al. 2409.11406 null
2024-09-17 Teaching dark matter simulations to speak the halo language Shivam Pandey et.al. 2409.11401 link
2024-09-17 Ultrasound Image Enhancement with the Variance of Diffusion Models Yuxin Zhang et.al. 2409.11380 link
2024-09-17 OSV: One Step is Enough for High-Quality Image to Video Generation Xiaofeng Mao et.al. 2409.11367 null
2024-09-17 Ping! Your Food is Ready: Comparing Different Notification Techniques in 3D AR Cooking Environment Aditya Raikwar et.al. 2409.11357 null
2024-09-17 Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think Gonzalo Martin Garcia et.al. 2409.11355 link
2024-09-17 OmniGen: Unified Image Generation Shitao Xiao et.al. 2409.11340 link
2024-09-17 fMRI-3D: A Comprehensive Dataset for Enhancing fMRI-based 3D Reconstruction Jianxiong Gao et.al. 2409.11315 null
2024-09-17 SpMis: An Investigation of Synthetic Spoken Misinformation Detection Peizhuo Liu et.al. 2409.11308 null
2024-09-17 Measurement of top-quark pair production in association with charm quarks in proton-proton collisions at $\sqrt{s}=13$ TeV with the ATLAS detector ATLAS Collaboration et.al. 2409.11305 null
2024-09-17 NirvaWave: An Accurate and Efficient Near Field Wave Propagation Simulator for 6G and Beyond Vahid Yazdnian et.al. 2409.11293 link
2024-09-17 DroneDiffusion: Robust Quadrotor Dynamics Learning with Diffusion Models Avirup Das et.al. 2409.11292 null
2024-09-17 Neural Networks for Vehicle Routing Problem László Kovács et.al. 2409.11290 null
2024-09-17 Attacking Slicing Network via Side-channel Reinforcement Learning Attack Wei Shao et.al. 2409.11258 null
2024-09-17 Learning Source Disentanglement in Neural Audio Codec Xiaoyu Bie et.al. 2409.11228 null
2024-09-16 Pennsieve - A Collaborative Platform for Translational Neuroscience and Beyond Zack Goldblum et.al. 2409.10509 null
2024-09-16 Torres funerarias chullpa en el valle del río Lauca: un primer análisis arqueoastronómico Alejandro Gangui et.al. 2409.10497 null
2024-09-16 Incorporating Classifier-Free Guidance in Diffusion Model-Based Recommendation Noah Buchanan et.al. 2409.10494 null
2024-09-16 SimInversion: A Simple Framework for Inversion-Based Text-to-Image Editing Qi Qian et.al. 2409.10476 null
2024-09-16 MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion Lehong Wu et.al. 2409.10473 null
2024-09-16 Signed Graph Autoencoder for Explainable and Polarization-Aware Network Embeddings Nikolaos Nakis et.al. 2409.10452 null
2024-09-16 Mamba-ST: State Space Model for Efficient Style Transfer Filippo Botti et.al. 2409.10385 link
2024-09-16 2D or not 2D: How Does the Dimensionality of Gesture Representation Affect 3D Co-Speech Gesture Generation? Téo Guichoux et.al. 2409.10357 null
2024-09-16 Taming Diffusion Models for Image Restoration: A Review Ziwei Luo et.al. 2409.10353 null
2024-09-16 MEGS: Morphological Evaluation of Galactic Structure Ufuk Çakır et.al. 2409.10346 link
2024-09-16 VAE-QWGAN: Improving Quantum GANs for High Resolution Image Generation Aaron Mark Thomas et.al. 2409.10339 null
2024-09-16 Research and Design of a Financial Intelligent Risk Control Platform Based on Big Data Analysis and Deep Machine Learning Shuochen Bi et.al. 2409.10331 null
2024-09-16 Fairness, not Emotion, Drives Socioeconomic Decision Making Rudra Mukhopadhyay et.al. 2409.10322 null
2024-09-16 On Synthetic Texture Datasets: Challenges, Creation, and Curation Blaine Hoak et.al. 2409.10297 null
2024-09-16 DreamHead: Learning Spatial-Temporal Correspondence via Hierarchical Diffusion for Audio-driven Talking Head Synthesis Fa-Ting Hong et.al. 2409.10281 null
2024-09-13 Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation Qingwen Bu et.al. 2409.09016 link
2024-09-13 A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis Yohan Poirier-Ginter et.al. 2409.08947 null
2024-09-13 Emerging Reliance Behaviors in Human-AI Text Generation: Hallucinations, Data Quality Assessment, and Cognitive Forcing Functions Zahra Ashktorab et.al. 2409.08937 null
2024-09-13 Latent Space Score-based Diffusion Model for Probabilistic Multivariate Time Series Imputation Guojun Liang et.al. 2409.08917 link
2024-09-13 Gaussian is All You Need: A Unified Framework for Solving Inverse Problems via Diffusion Posterior Sampling Nebiyou Yismaw et.al. 2409.08906 null
2024-09-13 Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control Carles Domingo-Enrich et.al. 2409.08861 null
2024-09-13 The Line-Based Dial-a-Ride Problem Kendra Reiter et.al. 2409.08860 link
2024-09-13 InstantDrag: Improving Interactivity in Drag-based Image Editing Joonghyuk Shin et.al. 2409.08857 null
2024-09-13 DX2CT: Diffusion Model for 3D CT Reconstruction from Bi or Mono-planar 2D X-ray(s) Yun Su Jeong et.al. 2409.08850 null
2024-09-13 Development of a Compton Imager Setup Anuraag Arya et.al. 2409.08822 null
2024-09-13 LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment Huan Zhang et.al. 2409.08795 link
2024-09-13 What You Say = What You Want? Teaching Humans to Articulate Requirements for LLMs Qianou Ma et.al. 2409.08775 link
2024-09-13 A Hybrid Meta-Learning and Multi-Armed Bandit Approach for Context-Specific Multi-Objective Recommendation Optimization Tiago Cunha et.al. 2409.08752 null
2024-09-13 Adaptive Sampling for Continuous Group Equivariant Neural Networks Berfin Inal et.al. 2409.08741 null
2024-09-13 DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset Jiawei Du et.al. 2409.08731 link
2024-09-12 DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors Thomas Hanwen Zhu et.al. 2409.08278 null
2024-09-12 Hand-Object Interaction Pretraining from Videos Himanshu Gaurav Singh et.al. 2409.08273 null
2024-09-12 Click2Mask: Local Editing with Dynamic Mask Generation Omer Regev et.al. 2409.08272 null
2024-09-12 DreamBeast: Distilling 3D Fantastical Animals with Part-Aware Knowledge Transfer Runjia Li et.al. 2409.08271 null
2024-09-12 Touch2Touch: Cross-Modal Tactile Generation for Object Manipulation Samanta Rodriguez et.al. 2409.08269 null
2024-09-12 Improving Text-guided Object Inpainting with Semantic Pre-inpainting Yifu Chen et.al. 2409.08260 link
2024-09-12 Improving Virtual Try-On with Garment-focused Diffusion Models Siqi Wan et.al. 2409.08258 null
2024-09-12 LoRID: Low-Rank Iterative Diffusion for Adversarial Purification Geigh Zollicoffer et.al. 2409.08255 null
2024-09-12 Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding Hongyu Li et.al. 2409.08251 null
2024-09-12 IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation Yinwei Wu et.al. 2409.08240 null
2024-09-12 Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources Alisia Lupidi et.al. 2409.08239 null
2024-09-12 LT3SD: Latent Trees for 3D Scene Diffusion Quan Meng et.al. 2409.08215 null
2024-09-12 VI3DRM: Towards meticulous 3D Reconstruction from Sparse Views via Photo-Realistic Novel View Synthesis Hao Chen et.al. 2409.08207 null
2024-09-12 High-Frequency Anti-DreamBooth: Robust Defense Against Image Synthesis Takuto Onikubo et.al. 2409.08167 link
2024-09-12 MagicStyle: Portrait Stylization Based on Reference Image Zhaoli Deng et.al. 2409.08156 null
2024-09-11 DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation Haibo Yang et.al. 2409.07454 null
2024-09-11 Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models Haibo Yang et.al. 2409.07452 link
2024-09-11 FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process Yang Luo et.al. 2409.07451 null
2024-09-11 Efficient One-Step Diffusion Refinement for Snapshot Compressive Imaging Yunzhen Wang et.al. 2409.07417 null
2024-09-11 Extracting TCPIP Headers at High Speed for the Anonymized Network Traffic Graph Challenge Zhaoyang Han et.al. 2409.07374 null
2024-09-11 Awaking the Slides: A Tuning-free and Knowledge-regulated AI Tutoring System via Language Model Coordination Daniel Zhang-Li et.al. 2409.07372 null
2024-09-11 Event-based Mosaicing Bundle Adjustment Shuang Guo et.al. 2409.07365 link
2024-09-11 Training-Free Guidance for Discrete Diffusion Models for Molecular Generation Thomas J. Kerby et.al. 2409.07359 null
2024-09-11 Learning Robotic Manipulation Policies from Point Clouds with Conditional Flow Matching Eugenio Chisari et.al. 2409.07343 null
2024-09-11 Efficient and Unbiased Sampling of Boltzmann Distributions via Consistency Models Fengzhe Zhang et.al. 2409.07323 null
2024-09-11 Optimizing Neural Network Performance and Interpretability with Diophantine Equation Encoding Ronald Katende et.al. 2409.07310 null
2024-09-11 Exploring User-level Gradient Inversion with a Diffusion Prior Zhuohang Li et.al. 2409.07291 null
2024-09-11 CCFExp: Facial Image Synthesis with Cycle Cross-Fusion Diffusion Model for Facial Paralysis Individuals Weixiang Gao et.al. 2409.07271 link
2024-09-11 Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models Sanoojan Baliah et.al. 2409.07269 link
2024-09-11 EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion Jian Zhang et.al. 2409.07255 null
2024-09-10 Technical Report of Mobile Manipulator Robot for Industrial Environments Erfan Amoozad Khalili et.al. 2409.06693 null
2024-09-10 SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation Teng Hu et.al. 2409.06633 null
2024-09-10 MVGaussian: High-Fidelity text-to-3D Content Generation with Multi-View Guidance and Surface Densification Phu Pham et.al. 2409.06620 null
2024-09-10 A Primer on Variational Inference for Physics-Informed Deep Generative Modelling Alex Glyn-Davies et.al. 2409.06560 null
2024-09-10 From LIMA to DeepLIMA: following a new path of interoperability Victor Bocharov et.al. 2409.06550 null
2024-09-10 Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models Xin Jing et.al. 2409.06451 null
2024-09-10 Prompt2Fashion: An automatically generated fashion dataset Georgia Argyro et.al. 2409.06442 link
2024-09-10 Fast nonparametric inference of network backbones for graph sparsification Alec Kirkley et.al. 2409.06417 link
2024-09-10 Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition Junzheng Zhang et.al. 2409.06371 null
2024-09-10 What happens to diffusion model likelihood when your model is conditional? Mattias Cross et.al. 2409.06364 null
2024-09-10 DiffQRCoder: Diffusion-based Aesthetic QR Code Generation with Scanning Robustness Guided Iterative Refinement Jia-Wei Liao et.al. 2409.06355 null
2024-09-10 Improving Conditional Level Generation using Automated Validation in Match-3 Games Monica Villanueva Aylagas et.al. 2409.06349 null
2024-09-10 Foragax: An Agent Based Modelling framework based on JAX Siddharth Chaturvedi et.al. 2409.06345 link
2024-09-10 G3PT: Unleash the power of Autoregressive Modeling in 3D Generation via Cross-scale Querying Transformer Jinzhi Zhang et.al. 2409.06322 null
2024-09-10 Learning Augmentation Policies from A Model Zoo for Time Series Forecasting Haochen Yuan et.al. 2409.06282 null
2024-09-09 Fast Generation of Custom Floating-Point Spatial Filters on FPGAs Nelson Campos et.al. 2409.05837 null
2024-09-09 Enhancing Preference-based Linear Bandits via Human Response Time Shen Li et.al. 2409.05798 null
2024-09-09 Predicting Critical Heat Flux with Uncertainty Quantification and Domain Generalization Using Conditional Variational Autoencoders and Deep Neural Networks Farah Alsafadi et.al. 2409.05790 null
2024-09-09 Vector Quantized Diffusion Model Based Speech Bandwidth Extension Yuan Fang et.al. 2409.05784 null
2024-09-09 AS-Speech: Adaptive Style For Speech Synthesis Zhipeng Li et.al. 2409.05730 null
2024-09-09 pFedGPA: Diffusion-based Generative Parameter Aggregation for Personalized Federated Learning Jiahao Lai et.al. 2409.05701 null
2024-09-09 Citizen-Led Personalization of User Interfaces: Investigating How People Customize Interfaces for Themselves and Others Sérgio Alves et.al. 2409.05696 null
2024-09-09 Unlearning or Concealment? A Critical Analysis and Evaluation Metrics for Unlearning in Diffusion Models Aakash Sen Sharma et.al. 2409.05668 null
2024-09-09 Forward KL Regularized Preference Optimization for Aligning Diffusion Policies Zhao Shan et.al. 2409.05622 null
2024-09-09 CustomContrast: A Multilevel Contrastive Perspective For Subject-Driven Text-to-Image Customization Nan Chen et.al. 2409.05606 null
2024-09-09 Latent 3D Brain MRI Counterfactual Wei Peng et.al. 2409.05585 null
2024-09-09 Spatially-Aware Speaker for Vision-and-Language Navigation Instruction Generation Muraleekrishna Gopinathan et.al. 2409.05583 link
2024-09-09 Design and Implementation of TAO DAQ System Shuihan Zhang et.al. 2409.05522 null
2024-09-09 A Taxonomy of Miscompressions: Preparing Image Forensics for Neural Compression Nora Hofer et.al. 2409.05490 null
2024-09-09 DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation Wei Wu et.al. 2409.05463 null
2024-09-06 VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation Yecheng Wu et.al. 2409.04429 link
2024-09-06 Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques Davide Clode da Silva et.al. 2409.04424 null
2024-09-06 Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation Zhuoyan Luo et.al. 2409.04410 null
2024-09-06 Enhancing Skin Lesion Diagnosis with Ensemble Learning Xiaoyi Liu et.al. 2409.04381 null
2024-09-06 How Fair is Your Diffusion Recommender Model? Daniele Malitesta et.al. 2409.04339 null
2024-09-06 Random effects estimation in a fractional diffusion model based on continuous observations Nesrine Chebli et.al. 2409.04331 null
2024-09-06 Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models Yuxiao Huang et.al. 2409.04270 null
2024-09-06 An overview of domain-specific foundation model: key technologies, applications and challenges Haolong Chen et.al. 2409.04267 null
2024-09-06 UniDet3D: Multi-dataset Indoor 3D Object Detection Maksim Kolodiazhnyi et.al. 2409.04234 link
2024-09-06 Generative Modelling via Quantile Regression Johannes Schmidt-Hieber et.al. 2409.04231 null
2024-09-06 Breaking the Brownian Barrier: Models and Manifestations of Molecular Diffusion in Complex Fluids Harish Srinivasan et.al. 2409.04199 null
2024-09-06 GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers Lorenza Prospero et.al. 2409.04196 null
2024-09-06 Subsampling of Correlated Graph Signals Rishabh Ravi et.al. 2409.04107 null
2024-09-06 Estimation of service value parameters for a queue with unobserved balking Daniel Podorojnyi et.al. 2409.04090 null
2024-09-06 D4: Text-guided diffusion model-based domain adaptive data augmentation for vineyard shoot detection Kentaro Hirahara et.al. 2409.04060 null
2024-09-05 Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding Yunze Man et.al. 2409.03757 link
2024-09-05 WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild Yuntian Deng et.al. 2409.03753 null
2024-09-05 ArtiFade: Learning to Generate High-quality Subject from Blemished Images Shuya Yang et.al. 2409.03745 null
2024-09-06 RAG based Question-Answering for Contextual Response Prediction System Sriram Veturi et.al. 2409.03708 null
2024-09-05 RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images Benzhi Wang et.al. 2409.03644 link
2024-09-05 DiffEVC: Any-to-Any Emotion Voice Conversion with Expressive Guidance Hsing-Hang Chou et.al. 2409.03636 null
2024-09-05 Generalizing Linear Graphs and Bond Graph Models with Hetero-functional Graphs for System-of-Systems Engineering Applications Ehsanoddin Ghorbanichemazkati et.al. 2409.03630 null
2024-09-05 TCDiff: Triple Condition Diffusion Model with 3D Constraints for Stylizing Synthetic Faces Bernardo Biesseck et.al. 2409.03600 link
2024-09-05 DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture Qianlong Xiang et.al. 2409.03550 null
2024-09-05 Euclid preparation. Simulations and nonlinearities beyond $Λ$CDM. 2. Results from non-standard simulations Euclid Collaboration et.al. 2409.03523 null
2024-09-05 Blended Latent Diffusion under Attention Control for Real-World Video Editing Deyin Liu et.al. 2409.03514 null
2024-09-05 Physical Modelling of Piano Sound Haifan Xie et.al. 2409.03481 null
2024-09-05 Data-free Distillation with Degradation-prompt Diffusion for Multi-weather Image Restoration Pei Wang et.al. 2409.03455 null
2024-09-05 Rx Strategist: Prescription Verification using LLM Agents System Phuc Phan Van et.al. 2409.03440 null
2024-09-05 KiloBot: A Programming Language for Deploying Perception-Guided Industrial Manipulators at Scale Wei Gao et.al. 2409.03439 null
2024-09-04 HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts Xinyu Liu et.al. 2409.02919 link
2024-09-04 Latent Watermarking of Audio Generative Models Robin San Roman et.al. 2409.02915 null
2024-09-04 Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling Kaiwen Zheng et.al. 2409.02908 null
2024-09-04 Configurable Foundation Models: Building LLMs from a Modular Perspective Chaojun Xiao et.al. 2409.02877 null
2024-09-04 Look Into the LITE in Deep Learning for Time Series Classification Ali Ismail-Fawaz et.al. 2409.02869 link
2024-09-04 Building a Scalable, Effective, and Steerable Search and Ranking Platform Marjan Celikik et.al. 2409.02856 null
2024-09-04 Human-VDM: Learning Single-Image 3D Human Gaussian Splatting from Video Diffusion Models Zhibin Liu et.al. 2409.02851 link
2024-09-04 Anomaly Detection in Offshore Open Radio Access Network Using Long Short-Term Memory Models on a Novel Artificial Intelligence-Driven Cloud-Native Data Platform Abdelrahim Ahmad et.al. 2409.02849 null
2024-09-04 Multi-Track MusicLDM: Towards Versatile Music Generation with Latent Diffusion Model Tornike Karchkhadze et.al. 2409.02845 null
2024-09-04 SNNAX -- Spiking Neural Networks in JAX Jamie Lohoff et.al. 2409.02842 null
2024-09-04 Experimental Framework for Generating Reliable Ground Truth for Laryngeal Spatial Segmentation Tasks Hamzeh Ghasemzadeh et.al. 2409.02809 null
2024-09-04 Creating a Gen-AI based Track and Trace Assistant MVP (SuperTracy) for PostNL Mohammad Reshadati et.al. 2409.02711 null
2024-09-04 Rethinking HTG Evaluation: Bridging Generation and Recognition Konstantina Nikolaidou et.al. 2409.02683 link
2024-09-04 Introduction to Machine Learning Laurent Younes et.al. 2409.02668 null
2024-09-04 Creating Domain-Specific Translation Memories for Machine Translation Fine-tuning: The TRENCARD Bilingual Cardiology Corpus Gokhan Dogru et.al. 2409.02667 null
2024-08-30 Generative AI Enables Medical Image Segmentation in Ultra Low-Data Regimes Li Zhang et.al. 2408.17421 link
2024-08-30 Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in the Environmental and Climate Change Domain Francesca Grasso et.al. 2408.17362 link
2024-08-30 Subspace Diffusion Posterior Sampling for Travel-Time Tomography Xiang Cao et.al. 2408.17333 null
2024-08-30 Structuring a Training Strategy to Robustify Perception Models with Realistic Image Augmentations Ahmed Hammam et.al. 2408.17311 null
2024-08-30 Leveraging Deep Generative Model For Computational Protein Design And Optimization Boqiao Lai et.al. 2408.17241 null
2024-08-30 Towards Symbolic XAI -- Explanation Through Human Understandable Logical Relationships Between Features Thomas Schnake et.al. 2408.17198 null
2024-09-02 Leveraging Blockchain and ANFIS for Optimal Supply Chain Management Amirfarhad Farhadi et.al. 2408.17161 null
2024-08-30 Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning Xiaoye Qu et.al. 2408.17150 link
2024-08-30 Flow Matching for Optimal Reaction Coordinates of Biomolecular System Mingyuan Zhang et.al. 2408.17139 link
2024-08-30 Temporal and Interactive Modeling for Efficient Human-Human Motion Generation Yabiao Wang et.al. 2408.17135 null
2024-09-02 RISSOLE: Parameter-efficient Diffusion Models via Block-wise Generation and Retrieval-Guidance Avideep Mukherjee et.al. 2408.17095 null
2024-08-30 FissionVAE: Federated Non-IID Image Generation with Latent Space and Decoder Decomposition Chen Hu et.al. 2408.17090 link
2024-08-30 Approximately Invertible Neural Network for Learned Image Compression Yanbo Gao et.al. 2408.17073 null
2024-09-02 Instant Adversarial Purification with Adversarial Consistency Distillation Chun Tong Lei et.al. 2408.17064 null
2024-08-30 Text-to-Image Generation Via Energy-Based CLIP Roy Ganz et.al. 2408.17046 null
2024-08-29 ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model Fangfu Liu et.al. 2408.16767 null
2024-08-29 CSGO: Content-Style Composition in Text-to-Image Generation Peng Xing et.al. 2408.16766 null
2024-08-29 A Score-Based Density Formula, with Applications in Diffusion Generative Models Gen Li et.al. 2408.16765 null
2024-08-29 UV-free Texture Generation with Denoising and Geodesic Heat Diffusions Simone Foti et.al. 2408.16762 link
2024-08-29 One-Shot Learning Meets Depth Diffusion in Multi-Object Videos Anisha Jain et.al. 2408.16704 null
2024-08-29 VMC: A Grammar for Visualizing Statistical Model Checks Ziyang Guo et.al. 2408.16702 null
2024-08-29 GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models Moreno D'Incà et.al. 2408.16700 link
2024-08-29 Optimization Models for the Quadratic Traveling Salesperson Problem Yuxiao Chen et.al. 2408.16680 null
2024-08-29 DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving Yongjie Fu et.al. 2408.16647 null
2024-08-29 RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model Zhuan Shi et.al. 2408.16634 null
2024-08-28 TEDRA: Text-based Editing of Dynamic and Photoreal Actors Basavaraj Sunagad et.al. 2408.15995 null
2024-08-28 Distribution Backtracking Builds A Faster Convergence Trajectory for One-step Diffusion Distillation Shengyuan Zhang et.al. 2408.15991 link
2024-08-28 Thoughtseeds: Evolutionary Priors, Nested Markov Blankets, and the Emergence of Embodied Cognition Prakash Chandra Kavi et.al. 2408.15982 null
2024-08-28 Stability of Primal-Dual Gradient Flow Dynamics for Multi-Block Convex Optimization Problems Ibrahim K. Ozaslan et.al. 2408.15969 null
2024-08-28 MetaGFN: Exploring Distant Modes with Adapted Metadynamics for Continuous GFlowNets Dominic Phillips et.al. 2408.15905 null
2024-08-28 Gen-Swarms: Adapting Deep Generative Models to Swarms of Drones Carlos Plou et.al. 2408.15899 null
2024-08-28 Airfoil Diffusion: Denoising Diffusion Model For Conditional Airfoil Generation Reid Graves et.al. 2408.15898 link
2024-08-28 Disentangled Diffusion Autoencoder for Harmonization of Multi-site Neuroimaging Data Ayodeji Ijishakin et.al. 2408.15890 null
2024-08-29 Recent Decade's Power Outage Data Reveals the Increasing Vulnerability of U.S. Power Infrastructure Bo Li et.al. 2408.15882 null
2024-08-28 GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model Yongjie Fu et.al. 2408.15868 null
2024-08-27 GenRec: Unifying Video Generation and Recognition with Diffusion Models Zejia Weng et.al. 2408.15241 link
2024-08-27 Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation Xiaojuan Wang et.al. 2408.15239 null
2024-08-27 Simulation of Stochastic Discrete Dislocation Dynamics in Ductile Vs Brittle Materials Santosh Chhetri et.al. 2408.15157 null
2024-08-27 How transformers learn structured data: insights from hierarchical filtering Jerome Garnier-Brun et.al. 2408.15138 link
2024-08-27 DIFR3CT: Latent Diffusion for Probabilistic 3D CT Reconstruction from Few Planar X-Rays Yiran Sun et.al. 2408.15118 link
2024-08-27 Data-Driven Nonlinear Deformation Design of 3D-Printable Shells Samuel Silverman et.al. 2408.15097 link
2024-08-27 Constrained Diffusion Models via Dual Training Shervin Khalafi et.al. 2408.15094 null
2024-08-27 LN-Gen: Rectal Lymph Nodes Generation via Anatomical Features Weidong Guo et.al. 2408.14977 null
2024-08-27 MegActor-$Σ$: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer Shurong Yang et.al. 2408.14975 null
2024-08-27 Integrated Bundling and Pricing of Unique Items Maxime Bouscary et.al. 2408.14913 null
2024-08-26 K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences Zhikai Li et.al. 2408.14468 null
2024-08-26 Uncovering Knowledge Gaps in Radiology Report Generation Models through Knowledge Graphs Xiaoman Zhang et.al. 2408.14397 link
2024-08-26 Reprogramming Foundational Large Language Models(LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning Sakhinana Sagar Srinivas et.al. 2408.14387 null
2024-08-26 GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy Peiyan Li et.al. 2408.14368 link
2024-08-27 Foundation Models for Music: A Survey Yinghao Ma et.al. 2408.14340 link
2024-08-26 Automated Machine Learning in Insurance Panyi Dong et.al. 2408.14331 link
2024-08-26 LLM-3D Print: Large Language Models To Monitor and Control 3D Printing Yayati Jadhav et.al. 2408.14307 null
2024-08-26 Learning Local Pattern Modularization for Point Cloud Reconstruction from Unseen Classes Chao Chen et.al. 2408.14279 null
2024-08-26 Towards Synthetic Trace Generation of Modeling Operations using In-Context Learning Approach Vittoriano Muttillo et.al. 2408.14259 null
2024-08-27 Text3DAug -- Prompted Instance Augmentation for LiDAR Perception Laurenz Reichardt et.al. 2408.14253 link
2024-08-23 How Diffusion Models Learn to Factorize and Compose Qiyao Liang et.al. 2408.13256 null
2024-08-23 Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption Sakhinana Sagar Srinivas et.al. 2408.13248 null
2024-08-23 CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities Tao Wu et.al. 2408.13239 null
2024-08-23 Social Welfare Maximization for Federated Learning with Network Effects Xiang Li et.al. 2408.13223 null
2024-08-23 Instruct-DeBERTa: A Hybrid Approach for Aspect-based Sentiment Analysis on Textual Reviews Dineth Jayakody et.al. 2408.13202 null
2024-08-23 IFH: a Diffusion Framework for Flexible Design of Graph Generative Models Samuel Cognolato et.al. 2408.13194 link
2024-08-23 Deep Learning for Lung Disease Classification Using Transfer Learning and a Customized CNN Architecture with Attention Xiaoyi Liu et.al. 2408.13180 null
2024-08-26 Focus on Neighbors and Know the Whole: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation Bonan Li et.al. 2408.13149 null
2024-08-23 Diffusion-based Episodes Augmentation for Offline Multi-Agent Reinforcement Learning Jihwan Oh et.al. 2408.13092 null
2024-08-23 General Intelligent Imaging and Uncertainty Quantification by Deterministic Diffusion Model Weiru Fan et.al. 2408.13061 null
2024-08-22 xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations Can Qin et.al. 2408.12590 null
2024-08-22 ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation Lujia Zhong et.al. 2408.12561 link
2024-08-22 Show-o: One Single Transformer to Unify Multimodal Understanding and Generation Jinheng Xie et.al. 2408.12528 null
2024-08-22 FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing Jue Wang et.al. 2408.12429 link
2024-08-22 Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification Sudi Murindanyi et.al. 2408.12426 null
2024-08-22 4D Diffusion for Dynamic Protein Structure Prediction with Reference Guided Motion Alignment Kaihui Cheng et.al. 2408.12419 null
2024-08-22 CODE: Confident Ordinary Differential Editing Bastien van Delft et.al. 2408.12418 link
2024-08-22 Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures Ce Liu et.al. 2408.12413 null
2024-08-22 A Stable Polygamy Approach to Spectrum Access with Channel Reuse Dan Ben Ami et.al. 2408.12402 null
2024-08-22 Multi-Style Facial Sketch Synthesis through Masked Generative Modeling Bowen Sun et.al. 2408.12400 null
2024-08-21 Pixel Is Not A Barrier: An Effective Evasion Attack for Pixel-Domain Diffusion Models Chun-Yen Shih et.al. 2408.11810 null
2024-08-21 ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation Shiqi Yang et.al. 2408.11805 null
2024-08-21 DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework Zhifei Xie et.al. 2408.11788 null
2024-08-21 Timeline and Boundary Guided Diffusion Network for Video Shadow Detection Haipeng Zhou et.al. 2408.11785 link
2024-08-21 Sum of Squares Circuits Lorenzo Loconte et.al. 2408.11778 null
2024-08-21 Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards Omar Erak et.al. 2408.11775 link
2024-08-21 D-RMGPT: Robot-assisted collaborative tasks driven by large multimodal models M. Forlini et.al. 2408.11761 null
2024-08-21 JieHua Paintings Style Feature Extracting Model using Stable Diffusion with ControlNet Yujia Gu et.al. 2408.11744 null
2024-08-21 Enhancing Cross-Modal Medical Image Segmentation through Compositionality Aniek Eijpe et.al. 2408.11733 link
2024-08-21 AI-assisted Automated Short Answer Grading of Handwritten University Level Mathematics Exams Tianyi Liu et.al. 2408.11728 null
2024-08-20 Reconciling Methodological Paradigms: Employing Large Language Models as Novice Qualitative Research Assistants in Talent Management Research Sreyoshi Bhaduri et.al. 2408.11043 null
2024-08-20 Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model Chunting Zhou et.al. 2408.11039 null
2024-08-20 Full Detector Simulation of a Projective Dual-Readout Segmented Crystal Electromagnetic Calorimeter with Precision Timing Wonyong Chung et.al. 2408.11027 null
2024-08-20 MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning Haoning Wu et.al. 2408.11001 link
2024-08-20 GreediRIS: Scalable Influence Maximization using Distributed Streaming Maximum Cover Reet Barik et.al. 2408.10982 null
2024-08-21 Assortment Optimization Under History-Dependent Effects Taotao He et.al. 2408.10967 null
2024-08-20 Kilometer-Scale Convection Allowing Model Emulation using Generative Diffusion Modeling Jaideep Pathak et.al. 2408.10958 null
2024-08-20 SysBench: Can Large Language Models Follow System Messages? Yanzhao Qin et.al. 2408.10943 link
2024-08-20 A Closer Look at Data Augmentation Strategies for Finetuning-Based Low/Few-Shot Object Detection Vladislav Li et.al. 2408.10940 null
2024-08-20 Large Point-to-Gaussian Model for Image-to-3D Generation Longfei Lu et.al. 2408.10935 null
2024-08-19 MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model Minghua Liu et.al. 2408.10198 null
2024-08-19 SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views Chao Xu et.al. 2408.10195 null
2024-08-19 Customizing Language Models with Instance-wise LoRA for Sequential Recommendation Xiaoyu Kong et.al. 2408.10159 link
2024-08-19 Advancing Voice Cloning for Nepali: Leveraging Transfer Learning in a Low-Resource Language Manjil Karki et.al. 2408.10128 null
2024-08-19 Learning Precise Affordances from Egocentric Videos for Robotic Manipulation Gen Li et.al. 2408.10123 null
2024-08-19 Convert and Speak: Zero-shot Accent Conversion with Minimum Supervision Zhijun Jia et.al. 2408.10096 null
2024-08-19 Stacked Intelligent Metasurfaces for Integrated Sensing and Communications Haoxian Niu et.al. 2408.10043 null
2024-08-19 General Impedance Modeling for Modular Multilevel Converter with Grid-forming and Grid-following Control Chu Sun et.al. 2408.10017 null
2024-08-19 Uniting contrastive and generative learning for event sequences models Aleksandr Yugay et.al. 2408.09995 null
2024-08-19 Multi-layer diffusion model of photovoltaic installations Tomasz Weron et.al. 2408.09904 null
2024-08-16 Automated High-throughput Organic Crystal Structure Prediction via Population-based Sampling Qiang Zhu et.al. 2408.08843 link
2024-08-16 PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future Guangyi Wang et.al. 2408.08822 null
2024-08-16 A Unified Automata-Theoretic Approach to LTLf Modulo Theories (Extended Version) Marco Faella et.al. 2408.08817 null
2024-08-16 EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics Chenwei Wan et.al. 2408.08782 link
2024-08-16 Comparative Analysis of Generative Models: Enhancing Image Synthesis with VAEs, GANs, and Stable Diffusion Sanchayan Vivekananthan et.al. 2408.08751 null
2024-08-16 The Blessing of Strategic Customers in Personalized Pricing Zhi Chen et.al. 2408.08738 null
2024-08-16 ChatZero: Zero-shot Cross-Lingual Dialogue Generation via Pseudo-Target Language Yongkang Liu et.al. 2408.08724 null
2024-08-16 An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation Peiming Guo et.al. 2408.08650 null
2024-08-16 Modeling the Neonatal Brain Development Using Implicit Neural Representations Florentin Bieder et.al. 2408.08647 link
2024-08-16 Sampling effects on Lasso estimation of drift functions in high-dimensional diffusion processes Chiara Amorino et.al. 2408.08638 null
2024-08-15 Understanding the Local Geometry of Generative Model Manifolds Ahmed Imtiaz Humayun et.al. 2408.08307 null
2024-08-15 Accelerated Image-Aware Generative Diffusion Modeling Tanmay Asthana et.al. 2408.08306 null
2024-08-15 Marker or Markerless? Mode-Switchable Optical Tactile Sensing for Diverse Robot Tasks Ni Ou et.al. 2408.08276 null
2024-08-15 mhGPT: A Lightweight Generative Pre-Trained Transformer for Mental Health Text Analysis Dae-young Kim et.al. 2408.08261 null
2024-08-15 Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding Xiner Li et.al. 2408.08252 link
2024-08-15 Picosecond laser pulses for quantum dot-microcavity based single photon generation by cascaded electro-optic modulation of a narrow-linewidth laser Mio Poortvliet et.al. 2408.08213 null
2024-08-15 Not Every Image is Worth a Thousand Words: Quantifying Originality in Stable Diffusion Adi Haviv et.al. 2408.08184 null
2024-08-15 Impact of Comprehensive Data Preprocessing on Predictive Modelling of COVID-19 Mortality Sangita Das et.al. 2408.08142 link
2024-08-15 Decoding Memes: A Comparative Study of Machine Learning Models for Template Identification Levente Murgás et.al. 2408.08126 link
2024-08-15 When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding Pingping Zhang et.al. 2408.08093 null
2024-08-14 Detecting Near-Duplicate Face Images Sudipta Banerjee et.al. 2408.07689 link
2024-08-14 Composing Automatic Differentiation with Custom Derivatives of Higher-Order Functions Sam Estep et.al. 2408.07683 null
2024-08-14 Drug Discovery SMILES-to-Pharmacokinetics Diffusion Models with Deep Molecular Understanding Bing Hu et.al. 2408.07636 null
2024-08-14 Anisotropic Diffusion Model of Communication in 2D Biofilm Yanahan Paramalingam et.al. 2408.07626 null
2024-08-14 Neural Quantum States and Peaked Molecular Wave Functions: Curse or Blessing? Aleksei Malyshev et.al. 2408.07625 null
2024-08-14 MatterGPT: A Generative Transformer for Multi-Property Inverse Design of Solid-State Materials Yan Chen et.al. 2408.07608 null
2024-08-14 PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation Sang-Hoon Lee et.al. 2408.07547 link
2024-08-14 New Curriculum, New Chance -- Retrieval Augmented Generation for Lesson Planning in Ugandan Secondary Schools. Prototype Quality Evaluation Simon Kloker et.al. 2408.07542 null
2024-08-14 DifuzCam: Replacing Camera Lens with a Mask and a Diffusion Model Erez Yosef et.al. 2408.07541 null
2024-08-14 Towards Real-time Video Compressive Sensing on Mobile Devices Miao Cao et.al. 2408.07530 link
2024-08-13 Imagen 3 Imagen-Team-Google et.al. 2408.07009 null
2024-08-13 Low-Bitwidth Floating Point Quantization for Efficient High-Quality Diffusion Models Cheng Chen et.al. 2408.06995 null
2024-08-13 DCMSA: Multi-Head Self-Attention Mechanism Based on Deformable Convolution For Seismic Data Denoising Wang Mingwei et.al. 2408.06963 null
2024-08-13 Neural Speech and Audio Coding Minje Kim et.al. 2408.06954 null
2024-08-13 Diffusion Model for Slate Recommendation Federico Tomasi et.al. 2408.06883 null
2024-08-13 Efficient Search for Customized Activation Functions with Gradient Descent Lukas Strack et.al. 2408.06820 link
2024-08-13 Enhancing Diabetic Retinopathy Diagnosis: A Lightweight CNN Architecture for Efficient Exudate Detection in Retinal Fundus Images Mujadded Al Rabbani Alif et.al. 2408.06784 null
2024-08-13 Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective Ouxiang Li et.al. 2408.06741 link
2024-08-13 DiffLoRA: Generating Personalized Low-Rank Adaptation Weights with Diffusion Yujia Wu et.al. 2408.06740 null
2024-08-13 Multimodal Analysis of White Blood Cell Differentiation in Acute Myeloid Leukemia Patients using a β-Variational Autoencoder Gizem Mert et.al. 2408.06720 null
2024-08-12 The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Chris Lu et.al. 2408.06292 link
2024-08-12 Open-Source Molecular Processing Pipeline for Generating Molecules Shreyas V et.al. 2408.06261 null
2024-08-12 3D Reconstruction of Protein Structures from Multi-view AFM Images using Neural Radiance Fields (NeRFs) Jaydeep Rade et.al. 2408.06244 null
2024-08-12 Cislunar Constellation Design for Space Situational Awareness with Time-Expanded Facility Location Problem Yuri Shimane et.al. 2408.06238 null
2024-08-12 Novel View Synthesis from a Single Image with Pretrained Diffusion Guidance Taewon Kang et.al. 2408.06157 null
2024-08-12 LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library Tianhao Yu et.al. 2408.06150 null
2024-08-12 Efficient and Scalable Point Cloud Generation with Sparse Point-Voxel Diffusion Models Ioannis Romanelis et.al. 2408.06145 link
2024-08-12 Med42-v2: A Suite of Clinical LLMs Clément Christophe et.al. 2408.06142 null
2024-08-12 Five Pitfalls When Assessing Synthetic Medical Images with Reference Metrics Melanie Dohmen et.al. 2408.06075 null
2024-08-12 CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer Zhuoyi Yang et.al. 2408.06072 link
2024-08-09 Multi-Garment Customized Model Generation Yichen Liu et.al. 2408.05206 null
2024-08-09 TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning Yujie Feng et.al. 2408.05200 link
2024-08-09 Cell Morphology-Guided Small Molecule Generation with GFlowNets Stephen Zhewen Lu et.al. 2408.05196 link
2024-08-09 Lithography-free patterning of chalcogenide materials for integrated photonic devices Zhen Hu et.al. 2408.05099 null
2024-08-09 Social contagion under hybrid interactions Xincheng Shu et.al. 2408.05050 null
2024-08-09 Infrared Beam-shaping on Demand via Tailored Geometric Phase Metasurfaces employing the Plasmonic Phase-Change Material In3SbTe2 Lukas Conrads et.al. 2408.05044 null
2024-08-09 Collaborative Static-Dynamic Teaching: A Semi-Supervised Framework for Stripe-Like Space Target Detection Zijian Zhu et.al. 2408.05029 null
2024-08-09 Retrieval-augmented code completion for local projects using large language models Marko Hostnik et.al. 2408.05026 null
2024-08-09 DreamCouple: Exploring High Quality Text-to-3D Generation Via Rectified Flow Hangyu Li et.al. 2408.05008 null
2024-08-09 Pay Attention To Mean Fields For Point Cloud Generation Benno Käch et.al. 2408.04997 link
2024-08-08 Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics Ruining Li et.al. 2408.04631 null
2024-08-08 Transformer Explainer: Interactive Learning of Text-Generative Models Aeree Cho et.al. 2408.04619 null
2024-08-08 Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches Yongzhi Xu et.al. 2408.04567 null
2024-08-08 Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models Yupeng Chang et.al. 2408.04556 link
2024-08-08 On the Asymptotic Convergence of Subgraph Generated Models Xinchen Xu et.al. 2408.04541 null
2024-08-08 AExGym: Benchmarks and Environments for Adaptive Experimentation Jimmy Wang et.al. 2408.04531 null
2024-08-08 NFDI4Health workflow and service for synthetic data generation, assessment and risk management Sobhan Moazemi et.al. 2408.04478 null
2024-08-08 Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations Julen Urain et.al. 2408.04380 null
2024-08-08 Making sense of AI systems development Mateusz Dolata et.al. 2408.04311 null
2024-08-08 AI-Driven Chatbot for Intrusion Detection in Edge Networks: Enhancing Cybersecurity with Ethical User Consent Mugheez Asif et.al. 2408.04281 null
2024-08-07 Prospects for using drones to test formation-flying CubeSat concepts, and other astronomical applications John D. Monnier et.al. 2408.03911 null
2024-08-07 Hate Speech Detection and Classification in Amharic Text with Deep Learning Samuel Minale Gashe et.al. 2408.03849 null
2024-08-07 WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models Prannaya Gupta et.al. 2408.03837 link
2024-08-07 A broken duet: multistable dynamics of dyadic interactions Johan Medrano et.al. 2408.03809 link
2024-08-07 Navigating the Human Maze: Real-Time Robot Pathfinding with Generative Imitation Learning Martin Moder et.al. 2408.03807 link
2024-08-07 Data Generation Scheme for Thermal Modality with Edge-Guided Adversarial Conditional Diffusion Model Guoqing Zhu et.al. 2408.03748 link
2024-08-07 Local Topology Measures of Contextual Language Model Latent Spaces With Applications to Dialogue Term Extraction Benjamin Matthias Ruppik et.al. 2408.03706 null
2024-08-07 Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling Zilyu Ye et.al. 2408.03695 link
2024-08-07 Unsupervised Detection of Fetal Brain Anomalies using Denoising Diffusion Models Markus Ditlev Sjøgren Olsen et.al. 2408.03654 null
2024-08-07 Goal-oriented Semantic Communication for the Metaverse Application Zhe Wang et.al. 2408.03646 null
2024-08-06 MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation Xiaofeng Mao et.al. 2408.03312 null
2024-08-06 IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts Ciara Rowles et.al. 2408.03209 null
2024-08-06 Personalizing Federated Instrument Segmentation with Visual Trait Priors in Robotic Surgery Jialang Xu et.al. 2408.03208 null
2024-08-06 An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion Xingguang Yan et.al. 2408.03178 null
2024-08-06 Iterative CT Reconstruction via Latent Variable Optimization of Shallow Diffusion Models Sho Ozaki et.al. 2408.03156 null
2024-08-06 Enhancing Twitter Bot Detection via Multimodal Invariant Representations Jibing Gong et.al. 2408.03096 null
2024-08-06 Analysis of Argument Structure Constructions in a Deep Recurrent Language Model Pegah Ramezani et.al. 2408.03062 null
2024-08-06 OpenOmni: A Collaborative Open Source Tool for Building Future-Ready Multimodal Conversational Agents Qiang Sun et.al. 2408.03047 link
2024-08-06 Targeted Visual Prompting for Medical Visual Question Answering Sergio Tascon-Morales et.al. 2408.03043 link
2024-08-06 Training-Free Condition Video Diffusion Models for single frame Spatial-Semantic Echocardiogram Synthesis Van Phi Nguyen et.al. 2408.03035 link
2024-08-05 Command-line Obfuscation Detection using Small Language Models Vojtech Outrata et.al. 2408.02637 null
2024-08-05 VidGen-1M: A Large-Scale Dataset for Text-to-video Generation Zhiyu Tan et.al. 2408.02629 null
2024-08-05 YOWOv3: An Efficient and Generalized Framework for Human Action Detection and Recognition Duc Manh Nguyen Dang et.al. 2408.02623 link
2024-08-05 LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba Yunxiang Fu et.al. 2408.02615 link
2024-08-05 MetaParticles: Computationally engineered nanomaterials with tunable and responsive properties Massimiliano Paesani et.al. 2408.02564 null
2024-08-05 Fairness and Bias Mitigation in Computer Vision: A Survey Sepehr Dehdashtian et.al. 2408.02464 null
2024-08-05 TGS: Trajectory Generation and Selection using Vision Language Models in Mapless Outdoor Environments Daeun Song et.al. 2408.02454 null
2024-08-05 Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models Zi Liang et.al. 2408.02416 link
2024-08-05 Multi-weather Cross-view Geo-localization Using Denoising Diffusion Models Tongtong Feng et.al. 2408.02408 null
2024-08-05 A Few-Shot Approach for Relation Extraction Domain Adaptation using Large Language Models Vanni Zavarella et.al. 2408.02377 null
2024-08-02 Conditional LoRA Parameter Generation Xiaolong Jin et.al. 2408.01415 null
2024-08-02 Autoencoders in Function Space Justin Bunker et.al. 2408.01362 link
2024-08-02 MCGMark: An Encodable and Robust Online Watermark for LLM-Generated Malicious Code Kaiwen Ning et.al. 2408.01354 link
2024-08-02 TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling Dong Huo et.al. 2408.01291 null
2024-08-02 A General Framework to Boost 3D GS Initialization for Text-to-3D Generation by Lexical Richness Lutao Jiang et.al. 2408.01269 null
2024-08-02 Exchange control in a MOS double quantum dot made using a 300 mm wafer process Jacob F. Chittock-Wood et.al. 2408.01241 null
2024-08-02 CLIP4Sketch: Enhancing Sketch to Mugshot Matching through Dataset Augmentation using Diffusion Models Kushal Kumar Jain et.al. 2408.01233 null
2024-08-02 Reality Fusion: Robust Real-time Immersive Mobile Robot Teleoperation with Volumetric Visual Data Fusion Ke Li et.al. 2408.01225 link
2024-08-02 PSP-GEN: Stochastic inversion of the Process-Structure-Property chain in materials design through deep, generative probabilistic modeling Yaohua Zang et.al. 2408.01114 null
2024-08-02 Six Dragons Fly Again: Reviving 15th-Century Korean Court Music with Transformers and Novel Encoding Danbinaerin Han et.al. 2408.01096 link
2024-08-01 Optimizing Diffusion Models for Joint Trajectory Prediction and Controllable Generation Yixiao Wang et.al. 2408.00766 null
2024-08-01 Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention Susung Hong et.al. 2408.00760 link
2024-08-01 DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency Jovan Stojkovic et.al. 2408.00741 null
2024-08-01 TurboEdit: Text-Based Image Editing Using Few-Step Diffusion Models Gilad Deutch et.al. 2408.00735 null
2024-08-01 A Natural Language Processing Framework for Hotel Recommendation Based on Users' Text Reviews Lavrentia Aravani et.al. 2408.00716 null
2024-08-02 Reinforcement Learning applied to Insurance Portfolio Pursuit Edward James Young et.al. 2408.00713 link
2024-08-01 MotionFix: Text-Driven 3D Human Motion Editing Nikos Athanasiou et.al. 2408.00712 null
2024-08-01 Synthetic dual image generation for reduction of labeling efforts in semantic segmentation of micrographs with a customized metric function Matias Oscar Volman Stern et.al. 2408.00707 null
2024-08-01 AutoM3L: An Automated Multimodal Machine Learning Framework with Large Language Models Daqin Luo et.al. 2408.00665 link
2024-08-01 Privacy-preserving datasets by capturing feature distributions with Conditional VAEs Francesco Di Salvo et.al. 2408.00639 link
2024-07-31 Detecting, Explaining, and Mitigating Memorization in Diffusion Models Yuxin Wen et.al. 2407.21720 link
2024-07-31 Tora: Trajectory-oriented Diffusion Transformer for Video Generation Zhenghao Zhang et.al. 2407.21705 link
2024-07-31 Generative Diffusion Model for Seismic Imaging Improvement of Sparsely Acquired Data and Uncertainty Quantification Xingchen Shi et.al. 2407.21683 null
2024-07-31 Quality Control for Radiology Report Generation Models via Auxiliary Auditing Components Hermione Warr et.al. 2407.21638 null
2024-07-31 LLM-for-X: Application-agnostic Integration of Large Language Models to Support Personal Writing Workflows Lukas Teufelberger et.al. 2407.21593 null
2024-07-31 Long-term investment and energy procurement risk management under uncertainty for an electrolytic green hydrogen producer Owen Palmer et.al. 2407.21574 null
2024-07-31 Conditioned Prompt-Optimization for Continual Deepfake Detection Francesco Laiti et.al. 2407.21554 link
2024-07-31 CXSimulator: A User Behavior Simulation using LLM Embeddings for Web-Marketing Campaign Assessment Akira Kasuga et.al. 2407.21553 null
2024-07-31 Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation Junxuan Yu et.al. 2407.21490 null
2024-07-31 Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends Giuliano Martinelli et.al. 2407.21489 link
2024-07-30 Matting by Generation Zhixiang Wang et.al. 2407.21017 null
2024-07-30 Add-SD: Rational Generation without Manual Reference Lingfeng Yang et.al. 2407.21016 link
2024-07-30 Integrating Agent-Based and Compartmental Models for Infectious Disease Modeling: A Novel Hybrid Approach Inan Bostanci et.al. 2407.20993 null
2024-07-30 MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions Xiaowei Chi et.al. 2407.20962 link
2024-07-30 Mitigating calibration errors from mutual coupling with time-domain filtering of 21 cm cosmological radio observations N. Charles et.al. 2407.20923 null
2024-07-30 Impact of Geographical Separation on Spectrum Sharing Markets Kangle Mu et.al. 2407.20909 null
2024-07-30 Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering Yanpeng Zhao et.al. 2407.20908 link
2024-07-30 Vulnerabilities in AI-generated Image Detection: The Challenge of Adversarial Attacks Yunfeng Diao et.al. 2407.20836 null
2024-07-30 Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning Norman Di Palo et.al. 2407.20798 null
2024-07-30 SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models Zheng Liu et.al. 2407.20756 link
2024-07-29 Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing Ekaterina Iakovleva et.al. 2407.20232 null
2024-07-29 LatentArtiFusion: An Effective and Efficient Histological Artifacts Restoration Framework Zhenqi He et.al. 2407.20172 link
2024-07-29 Diffusion Feedback Helps CLIP See Better Wenxuan Wang et.al. 2407.20171 link
2024-07-29 DDAP: Dual-Domain Anti-Personalization against Text-to-Image Diffusion Models Jing Yang et.al. 2407.20141 null
2024-07-29 Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning Liyuan Mao et.al. 2407.20109 null
2024-07-29 On the significance of parameters and the projective level in the Choice and Collection axioms Vladimir Kanovei et.al. 2407.20098 null
2024-07-29 Generative Diffusion Model Bootstraps Zero-shot Classification of Fetal Ultrasound Images In Underrepresented African Populations Fangyijie Wang et.al. 2407.20072 link
2024-07-29 ImagiNet: A Multi-Content Dataset for Generalizable Synthetic Image Detection via Contrastive Learning Delyan Boychev et.al. 2407.20020 link
2024-07-29 Reproducibility Study of "ITI-GEN: Inclusive Text-to-Image Generation" Daniel Gallo Fernández et.al. 2407.19996 link
2024-07-29 HeadsetOff: Enabling Photorealistic Video Conferencing on Economical VR Headsets Yili Jin et.al. 2407.19988 null
2024-07-26 Generative Adversarial Networks for Imputing Sparse Learning Performance Liang Zhang et.al. 2407.18875 null
2024-07-26 Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment Yuze Zheng et.al. 2407.18854 null
2024-07-26 Scalable Group Choreography via Variational Phase Manifold Learning Nhat Le et.al. 2407.18839 null
2024-07-26 Revision of calcium and scandium abundances in Am stars based on NLTE calculations and comparison with diffusion stellar evolution models L. I. Mashonkina et.al. 2407.18736 null
2024-07-26 BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation Peng Hao et.al. 2407.18715 null
2024-07-26 Q-gen: A Parameterized Quantum Circuit Generator Yikai Mao et.al. 2407.18697 link
2024-07-26 Adversarial Robustification via Text-to-Image Diffusion Models Daewon Choi et.al. 2407.18658 link
2024-07-26 Robust VAEs via Generating Process of Noise Augmented Data Hiroo Irobe et.al. 2407.18632 null
2024-07-26 Denoising Lévy Probabilistic Models Dario Shariatian et.al. 2407.18609 link
2024-07-26 How To Segment in 3D Using 2D Models: Automated 3D Segmentation of Prostate Cancer Metastatic Lesions on PET Volumes Using Multi-Angle Maximum Intensity Projections and Diffusion Models Amirhosein Toosi et.al. 2407.18555 link
2024-07-25 RegionDrag: Fast Region-Based Image Editing with Diffusion Models Jingyi Lu et.al. 2407.18247 null
2024-07-25 VGGHeads: A Large-Scale Synthetic Dataset for 3D Human Heads Orest Kupyn et.al. 2407.18245 link
2024-07-25 CodedVO: Coded Visual Odometry Sachin Shah et.al. 2407.18240 null
2024-07-25 SuperFlow: A Fully-Customized RTL-to-GDS Design Automation Flow for Adiabatic Quantum-Flux-Parametron Superconducting Circuits Yanyue Xie et.al. 2407.18209 null
2024-07-25 Test2VA: Reusing GUI Test Cases for Voice Assistant Features Development in Mobile Applications Garrett Weaver et.al. 2407.18155 null
2024-07-25 Self-supervised pre-training with diffusion model for few-shot landmark detection in x-ray images Roberto Di Via et.al. 2407.18125 null
2024-07-25 Keypoint Promptable Re-Identification Vladimir Somers et.al. 2407.18112 link
2024-07-25 SSTD: Stripe-Like Space Target Detection using Single-Point Supervision Zijian Zhu et.al. 2407.18097 null
2024-07-25 Cross-Observatory Coordination with tilepy: A Novel Tool for Observations of Multi-Messenger Transient Events Monica Seglar-Arroyo et.al. 2407.18076 null
2024-07-25 AttentionHand: Text-driven Controllable Hand Image Generation for 3D Hand Reconstruction in the Wild Junho Park et.al. 2407.18034 link
2024-07-24 SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency Yiming Xie et.al. 2407.17470 null
2024-07-24 BlueTempNet: A Temporal Multi-network Dataset of Social Interactions in Bluesky Social Ujun Jeong et.al. 2407.17451 link
2024-07-24 ProvenanceWidgets: A Library of UI Control Elements to Track and Dynamically Overlay Analytic Provenance Arpit Narechania et.al. 2407.17431 link
2024-07-24 CDDIP: Constrained Diffusion-Driven Deep Image Prior for Seismic Image Reconstruction Paul Goyes-Peñafiel et.al. 2407.17402 link
2024-07-24 Cosmic ray susceptibility of the Terahertz Intensity Mapper detector arrays Lun-Jun Liu et.al. 2407.17381 null
2024-07-24 ViPer: Visual Personalization of Generative Models via Individual Preference Learning Sogand Salehi et.al. 2407.17365 null
2024-07-24 Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching Yuyang Ding et.al. 2407.17349 link
2024-07-24 Quantum nonlocal modulation cancellation with distributed clocks Stephen D. Chapman et.al. 2407.17330 null
2024-07-25 Enhanced Deep Learning Methodologies and MRI Selection Techniques for Dementia Diagnosis in the Elderly Population Nikolaos Ntampakis et.al. 2407.17324 null
2024-07-24 Edge-Cloud Continuum Orchestration of Critical Services: A Smart-City Approach Rodrigo Rosmaninho et.al. 2407.17314 null
2024-07-23 Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions Fabio Tosi et.al. 2407.16698 link
2024-07-23 From Imitation to Refinement -- Residual RL for Precise Visual Assembly Lars Ankile et.al. 2407.16677 null
2024-07-23 RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent Huiyu Xu et.al. 2407.16667 null
2024-07-23 MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence Canyu Zhao et.al. 2407.16655 null
2024-07-23 Unveiling and Mitigating Bias in Audio Visual Segmentation Peiwen Sun et.al. 2407.16638 null
2024-07-23 Knowledge-driven AI-generated data for accurate and interpretable breast ultrasound diagnoses Haojun Yu et.al. 2407.16634 null
2024-07-23 GenRec: A Flexible Data Generator for Recommendations Erica Coppolillo et.al. 2407.16594 null
2024-07-23 COALA: A Practical and Vision-Centric Federated Learning Platform Weiming Zhuang et.al. 2407.16560 link
2024-07-23 DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models Zhenyu Xie et.al. 2407.16511 null
2024-07-23 qMRI Diffusor: Quantitative T1 Mapping of the Brain using a Denoising Diffusion Probabilistic Model Shishuai Wang et.al. 2407.16477 null
2024-07-22 Artist: Aesthetically Controllable Text-Driven Stylization without Training Ruixiang Jiang et.al. 2407.15842 link
2024-07-23 A Large-scale Benchmark Dataset for Commuting Origin-destination Matrix Generation Can Rong et.al. 2407.15823 link
2024-07-22 Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget Vikash Sehwag et.al. 2407.15811 null
2024-07-22 Quantum Computing for Phonon Scattering Effects on Thermal Conductivity Xiangjun Tan et.al. 2407.15808 null
2024-07-22 Enhancing Mass Customization Manufacturing: Multiobjective Metaheuristic Algorithms for flow shop Production in Smart Industry Diego Rossit et.al. 2407.15802 null
2024-07-22 Diffusion Model Based Resource Allocation Strategy in Ultra-Reliable Wireless Networked Control Systems Amirhassan Babazadeh Darabi et.al. 2407.15784 null
2024-07-22 A Hamilton-Jacobi approach to road-field reaction-diffusion models Christopher Henderson et.al. 2407.15760 null
2024-07-22 Diffusion for Out-of-Distribution Detection on Road Scenes and Beyond Silvio Galesso et.al. 2407.15739 link
2024-07-22 DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design Zhi Hao Luo et.al. 2407.15723 link
2024-07-22 Estimating Probability Densities with Transformer and Denoising Diffusion Henry W. Leung et.al. 2407.15703 link
2024-07-19 DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks Sarah Jabbour et.al. 2407.14509 null
2024-07-19 On Pre-training of Multimodal Language Models Customized for Chart Understanding Wan-Cyuan Fan et.al. 2407.14506 null
2024-07-19 T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation Kaiyue Sun et.al. 2407.14505 link
2024-07-19 M2D2M: Multi-Motion Generation from Text with Discrete Diffusion Models Seunggeun Chi et.al. 2407.14502 null
2024-07-19 A Precision Cryogenic Positioning Stage for Detector Dithering and Flexure Compensation Stephen A. Smee et.al. 2407.14493 null
2024-07-19 Contrastive Learning with Counterfactual Explanations for Radiology Report Generation Mingjie Li et.al. 2407.14474 null
2024-07-19 Describe Data to get Science-Data-Ready Tooling: Awkward as a Target for Kaitai Struct YAML Manasvi Goyal et.al. 2407.14461 null
2024-07-19 Co-synthesis of Histopathology Nuclei Image-Label Pairs using a Context-Conditioned Joint Diffusion Model Seonghui Min et.al. 2407.14434 null
2024-07-19 Controllable and Efficient Multi-Class Pathology Nuclei Data Augmentation using Text-Conditioned Diffusion Models Hyun-Jic Oh et.al. 2407.14426 null
2024-07-19 GLAudio Listens to the Sound of the Graph Aurelio Sulser et.al. 2407.14387 link
2024-07-18 LogoSticker: Inserting Logos into Diffusion Models for Customized Generation Mingkang Zhu et.al. 2407.13752 null
2024-07-18 Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review Masatoshi Uehara et.al. 2407.13734 link
2024-07-18 Shaded Route Planning Using Active Segmentation and Identification of Satellite Images Longchao Da et.al. 2407.13689 null
2024-07-18 PASTA: Controllable Part-Aware Shape Generation with Autoregressive Transformers Songlin Li et.al. 2407.13677 link
2024-07-18 MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture Synthesis Ziming Zhong et.al. 2407.13675 link
2024-07-18 Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models Xiaoyu Zhu et.al. 2407.13642 null
2024-07-18 Training-free Composite Scene Generation for Layout-to-Image Synthesis Jiaqi Liu et.al. 2407.13609 link
2024-07-18 EnergyDiff: Universal Time-Series Energy Data Generation using Diffusion Models Nan Lin et.al. 2407.13538 null
2024-07-18 VeriQR: A Robustness Verification Tool for Quantum Machine Learning Models Yanling Lin et.al. 2407.13533 null
2024-07-18 All Roads Lead to Rome? Exploring Representational Similarities Between Latent Spaces of Generative Image Models Charumathi Badrinath et.al. 2407.13449 link
2024-07-17 SMooDi: Stylized Motion Diffusion Model Lei Zhong et.al. 2407.12783 null
2024-07-17 VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control Sherwin Bahmani et.al. 2407.12781 null
2024-07-17 Hallucination Index: An Image Quality Metric for Generative Reconstruction Models Matthew Tivnan et.al. 2407.12780 null
2024-07-17 GroundUp: Rapid Sketch-Based 3D City Massing Gizem Esra Unlu et.al. 2407.12739 null
2024-07-17 EchoSight: Advancing Visual-Language Models with Wiki Knowledge Yibin Yan et.al. 2407.12735 null
2024-07-17 NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model Zhongqun Zhang et.al. 2407.12727 null
2024-07-17 An Evaluation of Continual Learning for Advanced Node Semiconductor Defect Inspection Amit Prasad et.al. 2407.12724 null
2024-07-17 Unlocking planetesimal magnetic field histories: a refined, versatile model for thermal evolution and dynamo generation Hannah R. Sanderson et.al. 2407.12721 null
2024-07-17 SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow Yuanzhi Zhu et.al. 2407.12718 link
2024-07-17 Teleoperation in Robot-assisted MIS with Adaptive RCM via Admittance Control Ehsan Nasiri et.al. 2407.12711 null
2024-07-16 Efficient Training with Denoised Neural Weights Yifan Gong et.al. 2407.11966 null
2024-07-16 UrbanWorld: An Urban World Model for 3D City Generation Yu Shang et.al. 2407.11965 link
2024-07-16 Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design Leo Klarner et.al. 2407.11942 link
2024-07-16 Code Documentation and Analysis to Secure Software Development Paul Attie et.al. 2407.11934 null
2024-07-16 Global Optimisation of Black-Box Functions with Generative Models in the Wasserstein Space Tigran Ramazyan et.al. 2407.11917 link
2024-07-16 Quantised Global Autoencoder: A Holistic Approach to Representing Visual Data Tim Elsner et.al. 2407.11913 null
2024-07-16 Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development Daoyuan Chen et.al. 2407.11784 link
2024-07-16 Diffusion-driven self-assembly of emerin nanodomains at the nuclear envelope Carlos D. Alas et.al. 2407.11758 null
2024-07-16 Generating Multi-Modal and Multi-Attribute Single-Cell Counts with CFGen Alessandro Palma et.al. 2407.11734 link
2024-07-16 Theoretical Insights into CycleGAN: Analyzing Approximation and Estimation Errors in Unpaired Data Generation Luwei Sun et.al. 2407.11678 null
2024-07-15 Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion Yongyuan Liang et.al. 2407.10973 null
2024-07-15 Fast Matrix Multiplications for Lookup Table-Quantized LLMs Han Guo et.al. 2407.10960 link
2024-07-15 InVi: Object Insertion In Videos Using Off-the-Shelf Diffusion Models Nirat Saini et.al. 2407.10958 null
2024-07-16 DataDream: Few-shot Guided Dataset Generation Jae Myung Kim et.al. 2407.10910 link
2024-07-15 Optical Diffusion Models for Image Generation Ilker Oguz et.al. 2407.10897 null
2024-07-15 R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection Zheyuan Zhou et.al. 2407.10862 null
2024-07-15 Physics-Inspired Generative Models in Medical Imaging: A Review Dennis Hein et.al. 2407.10856 null
2024-07-15 Inferring dark energy properties from the scale factor parametrisation Upala Mukhopadhayay et.al. 2407.10845 null
2024-07-15 MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration Yulin Ren et.al. 2407.10833 null
2024-07-15 Foundational Autoraters: Taming Large Language Models for Better Automatic Evaluation Tu Vu et.al. 2407.10817 null
2024-07-12 StyleSplat: 3D Object Style Transfer with Gaussian Splatting Sahil Jain et.al. 2407.09473 null
2024-07-12 FairyLandAI: Personalized Fairy Tales utilizing ChatGPT and DALLE-3 Georgios Makridis et.al. 2407.09467 null
2024-07-12 The $\mu\mathcal{G}$ Language for Programming Graph Neural Networks Matteo Belenchia et.al. 2407.09441 null
2024-07-12 Graph Neural Network Causal Explanation via Neural Causal Models Arman Behnam et.al. 2407.09378 link
2024-07-12 Computationally Efficient Estimation of Large Probit Models Patrick Ding et.al. 2407.09371 null
2024-07-12 Is Contrasting All You Need? Contrastive Learning for the Detection and Attribution of AI-generated Text Lucio La Cava et.al. 2407.09364 null
2024-07-15 Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees Alexia Jolicoeur-Martineau et.al. 2407.09357 link
2024-07-12 PID: Physics-Informed Diffusion Model for Infrared Image Generation Fangyuan Mao et.al. 2407.09299 link
2024-07-12 Learning Distances from Data with Normalizing Flows and Score Matching Peter Sorrenson et.al. 2407.09297 null
2024-07-12 Surgical Text-to-Image Generation Chinedu Innocent Nwoye et.al. 2407.09230 null
2024-07-11 Video Diffusion Alignment via Reward Gradients Mihir Prabhudesai et.al. 2407.08737 link
2024-07-11 Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models Zhening Xing et.al. 2407.08701 null
2024-07-11 FAR-Trans: An Investment Dataset for Financial Asset Recommendation Javier Sanz-Cruzado et.al. 2407.08692 null
2024-07-11 Scattering transforms on the sphere, application to large scale structure modelling Louise Mousset et.al. 2407.08687 null
2024-07-11 CAD-Prompted Generative Models: A Pathway to Feasible and Novel Engineering Designs Leah Chong et.al. 2407.08675 null
2024-07-11 Still-Moving: Customized Video Generation without Customized Video Data Hila Chefer et.al. 2407.08674 null
2024-07-11 Controlling the Fidelity and Diversity of Deep Generative Models via Pseudo Density Shuangqi Li et.al. 2407.08659 null
2024-07-11 Adaptive Smooth Non-Stationary Bandits Joe Suk et.al. 2407.08654 null
2024-07-11 Fine-Tuning Stable Diffusion XL for Stylistic Icon Generation: A Comparison of Caption Size Youssef Sultan et.al. 2407.08513 null
2024-07-11 Latent Conditional Diffusion-based Data Augmentation for Continuous-Time Dynamic Graph Model Yuxing Tian et.al. 2407.08500 null
2024-07-10 Generative Image as Action Models Mohit Shridhar et.al. 2407.07875 link
2024-07-10 Dynamical Measure Transport and Neural PDE Solvers for Sampling Jingtong Sun et.al. 2407.07873 null
2024-07-10 Controlling Space and Time with Diffusion Models Daniel Watson et.al. 2407.07860 null
2024-07-10 Generic Numerical Analysis of Stochastic Reaction Diffusion Model with applications in excitable media Yahya Alnashri et.al. 2407.07834 null
2024-07-10 Universal and non-universal signatures in the scaling functions of critical variables Gianluca Teza et.al. 2407.07782 null
2024-07-10 Towards Human-Like Driving: Active Inference in Autonomous Vehicle Control Elahe Delavari et.al. 2407.07684 null
2024-07-10 VEnhancer: Generative Space-Time Enhancement for Video Generation Jingwen He et.al. 2407.07667 null
2024-07-10 A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry Martin Lindström et.al. 2407.07664 link
2024-07-10 The heterogeneous impact of the EU-Canada agreement with causal machine learning Lionel Fontagné et.al. 2407.07652 null
2024-07-11 MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis Wanggui He et.al. 2407.07614 link
2024-07-09 ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction Shaozhe Hao et.al. 2407.07077 link
2024-07-09 Latent Space Imaging Matheus Souza et.al. 2407.07052 null
2024-07-09 Generative models of astrophysical fields with scattering transforms on the sphere Louise Mousset et.al. 2407.07007 link
2024-07-10 PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods Yiying Wang et.al. 2407.06985 link
2024-07-09 Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach Taolin Zhang et.al. 2407.06964 null
2024-07-09 RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models Bowen Zhang et.al. 2407.06938 null
2024-07-09 HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance Guian Fang et.al. 2407.06937 link
2024-07-09 Fine-grained large-scale content recommendations for MSX sellers Manpreet Singh et.al. 2407.06910 null
2024-07-09 Enhanced Battery Degradation-Aware Scheduling for Distribution Network with Electric Vehicle Load Vijay Babu Pamshetti et.al. 2407.06857 null
2024-07-09 A reaction-diffusion model for relapsing-remitting multiple sclerosis with a treatment term Romina Travaglini et.al. 2407.06802 null
2024-07-08 Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images Zhangyang Qi et.al. 2407.06191 null
2024-07-08 CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation Xinying Guo et.al. 2407.06188 null
2024-07-08 JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation Yu Zeng et.al. 2407.06187 null
2024-07-08 The Tug-of-War Between Deepfake Generation and Detection Hannah Lee et.al. 2407.06174 null
2024-07-08 ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation Ethan Chern et.al. 2407.06135 link
2024-07-08 Structured Generations: Using Hierarchical Clusters to guide Diffusion Models Jorge da Silva Goncalves et.al. 2407.06124 link
2024-07-08 PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models Jinhua Zhang et.al. 2407.06109 link
2024-07-08 Accelerating Diffusion for SAR-to-Optical Image Translation via Adversarial Consistency Distillation Xinyu Bai et.al. 2407.06095 null
2024-07-08 Assessing Cardiomegaly in Dogs Using a Simple CNN Model Nikhil Deekonda et.al. 2407.06092 null
2024-07-08 Layered Diffusion Model for One-Shot High Resolution Text-to-Image Synthesis Emaad Khwaja et.al. 2407.06079 null
2024-07-05 RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation Yuxuan Kuang et.al. 2407.04689 link
2024-07-05 Thermal and mechanical study of a parametrised cryostat model for optical characterisation of upcoming CMB experiments Thomas J. L. J. Gascard et.al. 2407.04613 link
2024-07-08 PartCraft: Crafting Creative Objects by Parts Kam Woh Ng et.al. 2407.04604 link
2024-07-05 Structural Constraint Integration in Generative Model for Discovery of Quantum Material Candidates Ryotaro Okabe et.al. 2407.04557 null
2024-07-05 Unified continuous-time q-learning for mean-field game and mean-field control problems Xiaoli Wei et.al. 2407.04521 null
2024-07-08 Speed-accuracy trade-off for the diffusion models: Wisdom from nonequilibrium thermodynamics and optimal transport Kotaro Ikeda et.al. 2407.04495 null
2024-07-05 PROUD: PaRetO-gUided Diffusion Model for Multi-objective Generation Yinghua Yao et.al. 2407.04493 link
2024-07-05 Dude: Dual Distribution-Aware Context Prompt Learning For Large Vision-Language Model Duy M. H. Nguyen et.al. 2407.04489 null
2024-07-05 Leveraging Graph Structures to Detect Hallucinations in Large Language Models Noa Nonkes et.al. 2407.04485 link
2024-07-05 VCD-Texture: Variance Alignment based 3D-2D Co-Denoising for Text-Guided Texturing Shang Liu et.al. 2407.04461 null
2024-07-03 DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents Yilun Xu et.al. 2407.03300 link
2024-07-03 Improved Noise Schedule for Diffusion Training Tiankai Hang et.al. 2407.03297 null
2024-07-03 Anomaly-based Framework for Detecting Power Overloading Cyberattacks in Smart Grid AMI Abdelaziz Amara Korba et.al. 2407.03264 null
2024-07-03 SOS! Soft Prompt Attack Against Open-Source Large Language Models Ziqing Yang et.al. 2407.03160 null
2024-07-04 Spatio-Temporal Adaptive Diffusion Models for EEG Super-Resolution in Epilepsy Diagnosis Tong Zhou et.al. 2407.03089 null
2024-07-03 Artificial Inductive Bias for Synthetic Tabular Data Generation in Data-Scarce Scenarios Patricia A. Apellániz et.al. 2407.03080 link
2024-07-03 Electromagnetic Property Sensing Based on Diffusion Model in ISAC System Yuhua Jiang et.al. 2407.03075 null
2024-07-03 Semantic-Aware Power Allocation for Generative Semantic Communications with Foundation Models Chunmei Xu et.al. 2407.03050 null
2024-07-03 SlerpFace: Face Template Protection via Spherical Linear Interpolation Zhizhou Zhong et.al. 2407.03043 null
2024-07-03 An Organism Starts with a Single Pix-Cell: A Neural Cellular Diffusion for High-Resolution Image Synthesis Marawan Elbatel et.al. 2407.03018 link
2024-07-02 Magic Insert: Style-Aware Drag-and-Drop Nataniel Ruiz et.al. 2407.02489 null
2024-07-02 Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models Fei Shen et.al. 2407.02482 link
2024-07-02 A Pattern Language for Machine Learning Tasks Benjamin Rodatz et.al. 2407.02424 null
2024-07-02 GCF: Graph Convolutional Networks for Facial Expression Recognition Hozaifa Kassab et.al. 2407.02361 null
2024-07-02 MORPHEUS: Modeling Role from Personalized Dialogue History by Exploring and Utilizing Latent Space Yihong Tang et.al. 2407.02345 null
2024-07-02 Choice-based time slot management in attended home delivery Dorsa Abdolhamidi et.al. 2407.02339 null
2024-07-02 Mining Constraints from Reference Process Models for Detecting Best-Practice Violations in Event Log Adrian Rebmann et.al. 2407.02336 link
2024-07-02 A tactical time slot management problem under mixed logit demand Dorsa Abdolhamidi et.al. 2407.02308 null
2024-07-02 Renard: A Modular Pipeline for Extracting Character Networks from Narrative Texts Arthur Amalvy et.al. 2407.02284 link
2024-07-03 Federated Distillation for Medical Image Classification: Towards Trustworthy Computer-Aided Diagnosis Sufen Ren et.al. 2407.02261 null
2024-06-28 Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language Yicheng Chen et.al. 2406.20085 null
2024-06-28 The hybrid Josephson rhombus: A superconducting element with tailored current-phase relation L. Banszerus et.al. 2406.20082 null
2024-06-28 HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model Hieu T. Nguyen et.al. 2406.20077 null
2024-06-28 Modeling and LQR Control of Insect Sized Flapping Wing Robot Daksh Dhingra et.al. 2406.20061 null
2024-06-28 Neural Differentiable Modeling with Diffusion-Based Super-resolution for Two-Dimensional Spatiotemporal Turbulence Xiantao Fan et.al. 2406.20047 null
2024-06-28 Electrostatics-based particle sampling and approximate inference Yongchao Huang et.al. 2406.20044 link
2024-06-28 HAITCH: A Framework for Distortion and Motion Correction in Fetal Multi-Shell Diffusion-Weighted MRI Haykel Snoussi et.al. 2406.20042 null
2024-06-28 Concept Lens: Visually Analyzing the Consistency of Semantic Manipulation in GANs Sangwon Jeong et.al. 2406.19987 null
2024-07-01 Text2Robot: Evolutionary Robot Design from Text Descriptions Ryan P. Ringel et.al. 2406.19963 link
2024-06-28 Kolmogorov-Smirnov GAN Maciej Falkiewicz et.al. 2406.19948 link
2024-06-27 Looking 3D: Anomaly Detection with 2D-3D Alignment Ankan Bhunia et.al. 2406.19393 link
2024-06-27 Taming Data and Transformers for Audio Generation Moayed Haji-Ali et.al. 2406.19388 null
2024-06-27 Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space Core Francisco Park et.al. 2406.19370 link
2024-06-27 Accelerating Multiphase Flow Simulations with Denoising Diffusion Model Driven Initializations Jaehong Chung et.al. 2406.19333 null
2024-06-27 Subtractive Training for Music Stem Insertion using Latent Diffusion Models Ivan Villa-Renteria et.al. 2406.19328 null
2024-06-27 Efficient World Models with Context-Aware Tokenization Vincent Micheli et.al. 2406.19320 link
2024-06-27 PNeRV: A Polynomial Neural Representation for Videos Sonam Gupta et.al. 2406.19299 null
2024-06-27 Compositional Image Decomposition with Diffusion Models Jocelin Su et.al. 2406.19298 null
2024-06-27 BISeizuRe: BERT-Inspired Seizure Data Representation to Improve Epilepsy Monitoring Luca Benfenati et.al. 2406.19189 null
2024-06-27 On Pólya-Young urn models and growth processes Markus Kuba et.al. 2406.19110 null
2024-06-26 MatchTime: Towards Automatic Soccer Game Commentary Generation Jiayuan Rao et.al. 2406.18530 link
2024-06-26 MultiDiff: Consistent Novel View Synthesis from a Single Image Norman Müller et.al. 2406.18524 null
2024-06-26 Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration Kang Liao et.al. 2406.18516 link
2024-06-26 DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance Younghyun Kim et.al. 2406.18459 link
2024-06-26 Cascading Large Language Models for Salient Event Graph Generation Xingwei Tan et.al. 2406.18449 link
2024-06-26 Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling Abril Corona-Figueroa et.al. 2406.18422 link
2024-06-26 Towards diffusion models for large-scale sea-ice modelling Tobias Sebastian Finn et.al. 2406.18417 null
2024-06-27 Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process Tianyu Lin et.al. 2406.18361 link
2024-06-26 Molecular Diffusion Models with Virtual Receptors Matan Halfon et.al. 2406.18330 null
2024-06-27 Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems Italo Luis da Silva et.al. 2406.18245 link
2024-06-25 DiffusionPDE: Generative PDE-Solving Under Partial Observation Jiahe Huang et.al. 2406.17763 link
2024-06-25 MotionBooth: Motion-Aware Customized Text-to-Video Generation Jianzong Wu et.al. 2406.17758 null
2024-06-25 Accelerating Clinical Evidence Synthesis with Large Language Models Zifeng Wang et.al. 2406.17755 null
2024-06-25 Extensions of Panjer's recursion for mixed compound distributions Spyridon M. Tzaninis et.al. 2406.17726 null
2024-06-25 PANDA: A self-driving lab for studying electrodeposited polymer films Harley Quinn et.al. 2406.17725 null
2024-06-25 Unified Auto-Encoding with Masked Diffusion Philippe Hansen-Estruch et.al. 2406.17688 link
2024-06-25 LaTable: Towards Large Tabular Models Boris van Breugel et.al. 2406.17673 null
2024-06-26 SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond Marco Comunità et.al. 2406.17672 null
2024-06-25 Banishing LLM Hallucinations Requires Rethinking Generalization Johnny Li et.al. 2406.17642 null
2024-06-25 The experience of humans' and robots' mutual (im)politeness in enacted service scenarios: An empirical study Victor Kaptelinin et.al. 2406.17641 null
2024-06-24 FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models Haonan Qiu et.al. 2406.16863 link
2024-06-24 Dreamitate: Real-World Visuomotor Policy Learning via Video Generation Junbang Liang et.al. 2406.16862 null
2024-06-24 DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation Yuang Peng et.al. 2406.16855 link
2024-06-24 USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations Mounika Marreddy et.al. 2406.16833 null
2024-06-24 General Binding Affinity Guidance for Diffusion Models in Structure-Based Drug Design Yue Jian et.al. 2406.16821 null
2024-06-24 ClotheDreamer: Text-Guided Garment Generation with 3D Gaussians Yufei Liu et.al. 2406.16815 null
2024-06-24 Conformal time series decomposition with component-wise exchangeability Derck W. E. Prinzhorn et.al. 2406.16766 link
2024-06-24 Inferring stochastic low-rank recurrent neural networks from neural data Matthijs Pals et.al. 2406.16749 link
2024-06-24 Portrait3D: 3D Head Generation from Single In-the-wild Portrait Image Jinkun Hao et.al. 2406.16710 null
2024-06-24 Geometry-Aware Score Distillation via 3D Consistent Noising and Gradient Consistency Modeling Min-Seop Kwak et.al. 2406.16695 null
2024-06-21 Masked Extended Attention for Zero-Shot Virtual Try-On In The Wild Nadav Orzech et.al. 2406.15331 null
2024-06-21 Rethinking Remote Sensing Change Detection With A Mask View Xiaowen Ma et.al. 2406.15320 link
2024-06-21 You Only Acquire Sparse-channel (YOAS): A Unified Framework for Dense-channel EEG Generation Hongyu Chen et.al. 2406.15269 null
2024-06-21 Evaluating Diversity in Automatic Poetry Generation Yanran Chen et.al. 2406.15267 link
2024-06-21 Fingerprint Membership and Identity Inference Against Generative Adversarial Networks Saverio Cavasin et.al. 2406.15253 null
2024-06-21 MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation Xuan He et.al. 2406.15252 null
2024-06-21 Unsupervised Bayesian Generation of Synthetic CT from CBCT Using Patient-Specific Score-Based Prior Junbo Peng et.al. 2406.15219 null
2024-06-21 Sound and Fury, Signifying Nothing? Impact of Data Breach Disclosure Laws Muhammad Zia Hydari et.al. 2406.15215 null
2024-06-21 Injecting Bias in Text-To-Image Models via Composite-Trigger Backdoors Ali Naseh et.al. 2406.15213 link
2024-06-21 Exploring the Efficacy of Robotic Assistants with ChatGPT and Claude in Enhancing ADHD Therapy: Innovating Treatment Paradigms Santiago Berrezueta-Guzman et.al. 2406.15198 null
2024-06-20 A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models Xincheng Shuai et.al. 2406.14555 link
2024-06-21 Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation Eyal Michaeli et.al. 2406.14551 link
2024-06-20 Consistency Models Made Easy Zhengyang Geng et.al. 2406.14548 link
2024-06-20 IRASim: Learning Interactive Real-Robot Action Simulators Fangqi Zhu et.al. 2406.14540 null
2024-06-20 Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps Nikita Starodubcev et.al. 2406.14539 null
2024-06-20 Fantastic Copyrighted Beasts and How (Not) to Generate Them Luxi He et.al. 2406.14526 null
2024-06-20 Photoacoustic methane detection assisted by a gas-filled anti-resonant hollow-core fiber laser Cuiling Zhang et.al. 2406.14521 null
2024-06-20 V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data Rotem Shalev-Arkushin et.al. 2406.14510 null
2024-06-20 CodeRAG-Bench: Can Retrieval Augment Code Generation? Zora Zhiruo Wang et.al. 2406.14497 link
2024-06-20 SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset Josef Dai et.al. 2406.14477 link
2024-06-20 CollaFuse: Collaborative Diffusion Models Simeon Allmendinger et.al. 2406.14429 link
2024-06-20 Active Diffusion Subsampling Oisin Nolan et.al. 2406.14388 link
2024-06-20 Multicoloured Hardcore Model: Fast Mixing and Queueing Sam Olesker-Taylor et.al. 2406.14376 null
2024-06-20 FairX: A comprehensive benchmarking tool for model analysis using fairness, utility, and explainability Md Fahim Sikder et.al. 2406.14281 link
2024-06-20 In Tree Structure Should Sentence Be Generated Yaguang Li et.al. 2406.14189 link
2024-06-20 CriDiff: Criss-cross Injection Diffusion Framework via Generative Pre-train for Prostate Segmentation Tingwei Liu et.al. 2406.14186 link
2024-06-20 Tractable Equilibrium Computation in Markov Games through Risk Aversion Eric Mazumdar et.al. 2406.14156 null
2024-06-20 ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning Zhongjie Duan et.al. 2406.14130 link
2024-06-20 Dye4AI: Assuring Data Boundary on Generative AI Services Shu Wang et.al. 2406.14114 null
2024-06-20 HeartBeat: Towards Controllable Echocardiography Video Synthesis with Multimodal Conditions-Guided Diffusion Models Xinrui Zhou et.al. 2406.14098 null
2024-06-20 Bridging bulk and surface: An interacting particle system towards the field-road diffusion model Matthieu Alfaro et.al. 2406.14093 null
2024-06-20 A Practical Diffusion Path for Sampling Omar Chehab et.al. 2406.14040 null
2024-06-20 Leveraging eBPF and AI for Ransomware Nose Out Arjun Sekar et.al. 2406.14020 null
2024-06-20 Feature Fusion Based on Mutual-Cross-Attention Mechanism for EEG Emotion Recognition Yimin Zhao et.al. 2406.14014 link
2024-06-20 Exploring Changes in Nation Perception with Nationality-Assigned Personas in LLMs Mahammed Kamruzzaman et.al. 2406.13993 null
2024-06-20 The Elusive Pursuit of Replicating PATE-GAN: Benchmarking, Auditing, Debugging Georgi Ganev et.al. 2406.13985 link
2024-06-20 Similarity-aware Syncretic Latent Diffusion Model for Medical Image Translation with Representation Learning Tingyi Lin et.al. 2406.13977 null
2024-06-20 Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models Yuan Zhong et.al. 2406.13942 null
2024-06-20 EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations Jie Ren et.al. 2406.13933 null
2024-06-20 Generative AI for Enhancing Active Learning in Education: A Comparative Study of GPT-3.5 and GPT-4 in Crafting Customized Test Questions Hamdireza Rouzegar et.al. 2406.13903 null
2024-06-19 INFusion: Diffusion Regularized Implicit Neural Representations for 2D and 3D accelerated MRI reconstruction Yamin Arefeen et.al. 2406.13895 null
2024-06-19 Open Generative Large Language Models for Galician Pablo Gamallo et.al. 2406.13893 null
2024-06-19 StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation Davit Abrahamyan et.al. 2406.13840 link
2024-06-19 RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design Rishabh Anand et.al. 2406.13839 link
2024-06-19 COAC: Cross-layer Optimization of Accelerator Configurability for Efficient CNN Processing Steven Colleman et.al. 2406.13752 null
2024-06-19 GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation Baiqi Li et.al. 2406.13743 link
2024-06-19 Tree-Sliced Wasserstein Distance on a System of Lines Viet-Hoang Tran et.al. 2406.13725 null
2024-06-19 Hitchhiker's guide on Energy-Based Models: a comprehensive review on the relation with other generative models, sampling and statistical physics Davide Carbone et.al. 2406.13661 null
2024-06-19 Towards Minimal Targeted Updates of Language Models with Targeted Negative Training Lily H. Zhang et.al. 2406.13660 link
2024-06-19 Stability and Generalizability in SDE Diffusion Models with Measure-Preserving Dynamics Weitong Zhang et.al. 2406.13652 null
2024-06-19 On AI-Inspired UI-Design Jialiang Wei et.al. 2406.13631 null
2024-06-19 Can AI be enabled to dynamical downscaling? Training a Latent Diffusion Model to mimic km-scale COSMO-CLM downscaling of ERA5 over Italy Elena Tomasi et.al. 2406.13627 link
2024-06-19 Enhance the Image: Super Resolution using Artificial Intelligence in MRI Ziyu Li et.al. 2406.13625 null
2024-06-19 Generative Modeling by Minimizing the Wasserstein-2 Loss Yu-Jui Huang et.al. 2406.13619 null
2024-06-19 Parameter Training Efficiency Aware Resource Allocation for AIGC in Space-Air-Ground Integrated Networks Liangxin Qian et.al. 2406.13602 null
2024-06-19 ModSec-Learn: Boosting ModSecurity with Machine Learning Christian Scano et.al. 2406.13547 link
2024-06-19 Towards Cyber Threat Intelligence for the IoT Alfonso Iacovazzi et.al. 2406.13543 null
2024-06-19 Image Distillation for Safe Data Sharing in Histopathology Zhe Li et.al. 2406.13536 link
2024-06-19 Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement Chenda Li et.al. 2406.13471 null
2024-06-19 Unifying nonlinearly constrained nonconvex optimization Charlie Vanaret et.al. 2406.13454 link
2024-06-19 Federating to Grow Transformers with Constrained Resources without Model Sharing Shikun Shen et.al. 2406.13450 null
2024-06-19 Multi-messenger modeling of the Monogem pulsar halo Youyou Li et.al. 2406.13426 null
2024-06-19 Style-NeRF2NeRF: 3D Style Transfer From Style-Aligned Multi-View Images Haruo Fujiwara et.al. 2406.13393 null
2024-06-19 Effective Edge-wise Representation Learning in Edge-Attributed Bipartite Graphs Hewen Wang et.al. 2406.13369 null
2024-06-19 Situational Instructions Database: Task Guidance in Dynamic Environments Muhammad Saif Ullah Khan et.al. 2406.13302 link
2024-06-19 ARDuP: Active Region Video Diffusion for Universal Policies Shuaiyi Huang et.al. 2406.13301 null
2024-06-19 AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models Ken Chen et.al. 2406.13272 null
2024-06-19 Self-Supervised Diffusion Model for 3-D Seismic Data Reconstruction Xinyang Wang et.al. 2406.13252 null
2024-06-19 Optimizing Inventory Management through Multiobjective Reverse Logistics with Environmental Impact I. B. Wadhawan et.al. 2406.13226 null
2024-06-19 Neural Residual Diffusion Models for Deep Scalable Vision Generation Zhiyuan Ma et.al. 2406.13215 null
2024-06-19 Surgical Triplet Recognition via Diffusion Model Daochang Liu et.al. 2406.13210 null
2024-06-19 Diffusion Model-based FOD Restoration from High Distortion in dMRI Shuo Huang et.al. 2406.13209 null
2024-06-19 Toward Structure Fairness in Dynamic Graph Embedding: A Trend-aware Dual Debiasing Approach Yicong Li et.al. 2406.13201 link
2024-06-19 Synthetic Context Generation for Question Generation Naiming Liu et.al. 2406.13188 null
2024-06-19 Conditional score-based diffusion models for solving inverse problems in mechanics Agnimitra Dasgupta et.al. 2406.13154 null
2024-06-19 von Mises Quasi-Processes for Bayesian Circular Regression Yarden Cohen et.al. 2406.13151 null
2024-06-19 MCAD: Multi-modal Conditioned Adversarial Diffusion Model for High-Quality PET Image Reconstruction Jiaqi Cui et.al. 2406.13150 null
2024-06-19 GVT2RPM: An Empirical Study for General Video Transformer Adaptation to Remote Physiological Measurement Hao Wang et.al. 2406.13136 null
2024-06-19 Thruster-Assisted Incline Walking Kaushik Venkatesh Krishnamurthy et.al. 2406.13118 null
2024-06-18 Sampling 3D Gaussian Scenes in Seconds with Latent Diffusion Models Paul Henderson et.al. 2406.13099 null
2024-06-18 RITA: A Real-time Interactive Talking Avatars Framework Wuxinlin Cheng et.al. 2406.13093 null
2024-06-18 PIPPIN: Generating variable length full events from partons Guillaume Quétant et.al. 2406.13074 link
2024-06-18 MaskPure: Improving Defense Against Text Adversaries with Stochastic Purification Harrison Gietz et.al. 2406.13066 link
2024-06-18 Traffic Prediction considering Multiple Levels of Spatial-temporal Information: A Multi-scale Graph Wavelet-based Approach Zilin Bian et.al. 2406.13038 null
2024-06-18 Sharp detection of low-dimensional structure in probability measures via dimensional logarithmic Sobolev inequalities Matthew T. C. Li et.al. 2406.13036 null
2024-06-18 Data Plagiarism Index: Characterizing the Privacy Risk of Data-Copying in Tabular Generative Models Joshua Ward et.al. 2406.13012 null
2024-06-18 Synergizing Foundation Models and Federated Learning: A Survey Shenghui Li et.al. 2406.12844 null
2024-06-18 Evaluating the design space of diffusion-based generative models Yuqing Wang et.al. 2406.12839 null
2024-06-18 Neural Approximate Mirror Maps for Constrained Diffusion Models Berthy T. Feng et.al. 2406.12816 null
2024-06-19 AITTI: Learning Adaptive Inclusive Token for Text-to-Image Generation Xinyu Hou et.al. 2406.12805 link
2024-06-18 Extracting Training Data from Unconditional Diffusion Models Yunhao Chen et.al. 2406.12752 null
2024-06-18 Useful stochastic bounds in time-varying queues with service and patience times having general joint distribution Shreehari Anand Bodas et.al. 2406.12745 null
2024-06-18 SUPER: Selfie Undistortion and Head Pose Editing with Identity Preservation Polina Karpikova et.al. 2406.12700 null
2024-06-18 Speak in the Scene: Diffusion-based Acoustic Scene Transfer toward Immersive Speech Generation Miseul Kim et.al. 2406.12688 null
2024-06-18 GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models Yongtao Ge et.al. 2406.12671 link
2024-06-18 Research and Implementation of Data Enhancement Techniques for Graph Neural Networks Jingzhao Gu et.al. 2406.12640 null
2024-06-18 News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation Andreea Iana et.al. 2406.12634 link
2024-06-18 Learning Diffusion at Lightspeed Antonio Terpin et.al. 2406.12616 null
2024-06-18 Unmasking the Veil: An Investigation into Concept Ablation for Privacy and Copyright Protection in Images Shivank Garg et.al. 2406.12592 link
2024-06-18 Behavior-Dependent Linear Recurrent Units for Efficient Sequential Recommendation Chengkai Liu et.al. 2406.12580 link
2024-06-18 Training Diffusion Models with Federated Learning Matthijs de Goede et.al. 2406.12575 null
2024-06-18 P-Tailor: Customizing Personality Traits for Language Models via Mixture of Specialized LoRA Experts Yuhao Dan et.al. 2406.12548 null
2024-06-18 Structured Detection for Simultaneous Super-Resolution and Optical Sectioning in Laser Scanning Microscopy Alessandro Zunino et.al. 2406.12542 link
2024-06-18 Variational Distillation of Diffusion Policies into Mixture of Experts Hongyi Zhou et.al. 2406.12538 null
2024-06-18 HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors Panwang Pan et.al. 2406.12459 link
2024-06-18 Planning Using Schrödinger Bridge Diffusion Models Adarsh Srivastava et.al. 2406.12458 link
2024-06-18 Deep Temporal Deaggregation: Large-Scale Spatio-Temporal Generative Models David Bergström et.al. 2406.12423 null
2024-06-18 ROVER: RTL Optimization via Verified E-Graph Rewriting Samuel Coward et.al. 2406.12421 null
2024-06-18 TADM: Temporally-Aware Diffusion Model for Neurodegenerative Progression on Brain MRI Mattia Litrico et.al. 2406.12411 null
2024-06-18 SDNIA-YOLO: A Robust Object Detection Model for Extreme Weather Conditions Yuexiong Ding et.al. 2406.12395 null

(back to top)

Vision-Language Models

Publish Date Title Authors PDF Code
2024-12-19 OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving Shuo Xing et.al. 2412.15208 null
2024-12-19 LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation Weijia Shi et.al. 2412.15188 null
2024-12-19 Qwen2.5 Technical Report Qwen et.al. 2412.15115 null
2024-12-19 Progressive Multimodal Reasoning via Active Retrieval Guanting Dong et.al. 2412.14835 null
2024-12-19 Explainable Tampered Text Detection via Multimodal Large Models Chenfan Qu et.al. 2412.14816 null
2024-12-18 Descriptive Caption Enhancement with Visual Specialists for Multimodal Perception Yanpeng Sun et.al. 2412.14233 link
2024-12-18 AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities Guillaume Astruc et.al. 2412.14123 link
2024-12-19 G-VEval: A Versatile Metric for Evaluating Image and Video Captions Using GPT-4o Tony Cheng Tong et.al. 2412.13647 link
2024-12-18 Detecting Machine-Generated Music with Explainability -- A Challenge and Early Benchmarks Yupei Li et.al. 2412.13421 null
2024-12-17 DoPTA: Improving Document Layout Analysis using Patch-Text Alignment Nikitha SR et.al. 2412.12902 null
2024-12-17 Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models YiFan Zhang et.al. 2412.12606 null
2024-12-17 PBVS 2024 Solution: Self-Supervised Learning and Sampling Strategies for SAR Classification in Extreme Long-Tail Distribution Yuhyun Kim et.al. 2412.12565 null
2024-12-17 Causal Diffusion Transformers for Generative Modeling Chaorui Deng et.al. 2412.12095 link
2024-12-16 CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology Yuxuan Sun et.al. 2412.12077 null
2024-12-16 Gramian Multimodal Representation Learning and Alignment Giordano Cicchetti et.al. 2412.11959 null
2024-12-16 LMM-Regularized CLIP Embeddings for Image Classification Maria Tzelepi et.al. 2412.11663 null
2024-12-15 Seeing the Forest and the Trees: Solving Visual Graph and Tree Based Data Structure Problems using Large Multimodal Models Sebastian Gutierrez et.al. 2412.11088 null
2024-12-13 Apollo: An Exploration of Video Understanding in Large Multimodal Models Orr Zohar et.al. 2412.10360 null
2024-12-13 Performance of ChatGPT on tasks involving physics visual representations: the case of the Brief Electricity and Magnetism Assessment Giulia Polverini et.al. 2412.10019 null
2024-12-12 Vision-Language Models Represent Darker-Skinned Black Individuals as More Homogeneous than Lighter-Skinned Black Individuals Messi H. J. Lee et.al. 2412.09668 null
2024-12-12 Exemplar Masking for Multimodal Incremental Learning Yi-Lun Lee et.al. 2412.09549 link
2024-12-12 Embeddings are all you need! Achieving High Performance Medical Image Classification through Training-Free Embedding Analysis Raj Hansini Khoiwal et.al. 2412.09445 null
2024-12-12 Enhancing Modality Representation and Alignment for Multimodal Cold-start Active Learning Meng Shen et.al. 2412.09126 null
2024-12-12 A Wander Through the Multimodal Landscape: Efficient Transfer Learning via Low-rank Sequence Multimodal Adapter Zirun Guo et.al. 2412.08979 null
2024-12-11 StreamChat: Chatting with Streaming Video Jihao Liu et.al. 2412.08646 null
2024-12-11 Multimodal Latent Language Modeling with Next-Token Diffusion Yutao Sun et.al. 2412.08635 link
2024-12-12 Design2GarmentCode: Turning Design Concepts to Tangible Garments Through Program Synthesis Feng Zhou et.al. 2412.08603 null
2024-12-11 Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions Mohammadmostafa Rostamkhani et.al. 2412.08169 link
2024-12-10 Explaining and Mitigating the Modality Gap in Contrastive Multimodal Learning Can Yaras et.al. 2412.07909 null
2024-12-10 BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities Sahal Shaji Mullappilly et.al. 2412.07769 link
2024-12-10 ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer Jinyi Hu et.al. 2412.07720 link
2024-12-13 DriveMM: All-in-One Large Multimodal Model for Autonomous Driving Zhijian Huang et.al. 2412.07689 link
2024-12-10 Driving with InternVL: Oustanding Champion in the Track on Driving with Language of the Autonomous Grand Challenge at CVPR 2024 Jiahan Li et.al. 2412.07247 null
2024-12-10 Maya: An Instruction Finetuned Multilingual Multimodal Model Nahid Alam et.al. 2412.07112 link
2024-12-09 How to Merge Your Multimodal Models Over Time? Sebastian Dziadzio et.al. 2412.06712 link
2024-12-09 Ranked from Within: Ranking Large Multimodal Models for Visual Question Answering Without Labels Weijie Tu et.al. 2412.06461 null
2024-12-09 iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models Lianyu Hu et.al. 2412.06263 link
2024-12-08 A Self-Learning Multimodal Approach for Fake News Detection Hao Chen et.al. 2412.05843 null
2024-12-08 SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation Leigang Qu et.al. 2412.05818 null
2024-12-07 WavFusion: Towards wav2vec 2.0 Multimodal Speech Emotion Recognition Feng Li et.al. 2412.05558 null
2024-12-07 Comprehensive Evaluation of Multimodal AI Models in Medical Imaging Diagnosis: From Data Augmentation to Preference-Based Comparison Cailian Ruan et.al. 2412.05536 null
2024-12-06 Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Zhe Chen et.al. 2412.05271 link
2024-12-05 Lattice Lingo: Effect of Textual Detail on Multimodal Learning for Property Prediction of Crystals Mrigi Munjal et.al. 2412.04670 null
2024-12-05 BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks Juan Rodriguez et.al. 2412.04626 null
2024-12-05 MageBench: Bridging Large Multimodal Models to Agents Miaosen Zhang et.al. 2412.04531 link
2024-12-04 Video Quality Assessment: A Comprehensive Survey Qi Zheng et.al. 2412.04508 link
2024-12-05 SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model Zhenglin Huang et.al. 2412.04292 null
2024-12-05 CALMM-Drive: Confidence-Aware Autonomous Driving with Large Multimodal Model Ruoyu Yao et.al. 2412.04209 null
2024-12-05 AIpparel: A Large Multimodal Generative Model for Digital Garments Kiyohiro Nakayama et.al. 2412.03937 null
2024-12-05 MegaCOIN: Enhancing Medium-Grained Color Perception for Vision-Language Models Ming-Chang Chiu et.al. 2412.03927 link
2024-12-04 Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning Wujian Peng et.al. 2412.03565 link
2024-12-04 Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning Neale Ratzlaff et.al. 2412.03467 null
2024-12-06 SJTU:Spatial judgments in multimodal models towards unified segmentation through coordinate detection Joongwon Chae et.al. 2412.02565 link
2024-12-03 Initial Study On Improving Segmentation By Combining Preoperative CT And Intraoperative CBCT Using Synthetic Data Maximilian E. Tschuchnig et.al. 2412.02294 null
2024-12-05 CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy Zhibo Yang et.al. 2412.02210 null
2024-12-03 VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding Kangsan Kim et.al. 2412.02186 link
2024-12-04 Agri-LLaVA: Knowledge-Infused Large Multimodal Assistant on Agricultural Pests and Diseases Liqiong Wang et.al. 2412.02158 link
2024-12-02 Attacks on multimodal models Viacheslav Iablochnikov et.al. 2412.01725 link
2024-12-02 LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant Yikun Liu et.al. 2412.01720 null
2024-12-01 VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation Weiming Ren et.al. 2412.00927 null
2024-11-30 MaintAGT:Sim2Real-Guided Multimodal Large Model for Intelligent Maintenance with Chain-of-Thought Reasoning Hongliang He et.al. 2412.00481 null
2024-11-30 Approximate Fiber Product: A Preliminary Algebraic-Geometric Perspective on Multimodal Embedding Alignment Dongfang Zhao et.al. 2412.00373 null
2024-12-04 ROSE: Revolutionizing Open-Set Dense Segmentation with Patch-Wise Perceptual Large Multimodal Model Kunyang Han et.al. 2412.00153 null
2024-11-28 Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers Chancharik Mitra et.al. 2412.00142 null
2024-12-02 LUMIA: Linear probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states Luis Ibanez-Lissen et.al. 2411.19876 null
2024-11-29 SDR-GNN: Spectral Domain Reconstruction Graph Neural Network for Incomplete Multimodal Learning in Conversational Emotion Recognition Fangze Fu et.al. 2411.19822 null
2024-11-29 JetFormer: An Autoregressive Generative Model of Raw Images and Text Michael Tschannen et.al. 2411.19722 null
2024-11-28 Beyond Logit Lens: Contextual Embeddings for Robust Hallucination Detection & Grounding in VLMs Anirudh Phukan et.al. 2411.19187 null
2024-11-28 Examining Multimodal Gender and Content Bias in ChatGPT-4o Roberto Balestri et.al. 2411.19140 null
2024-11-28 ScratchEval: Are GPT-4o Smarter than My Child? Evaluating Large Multimodal Models with Visual Programming Challenges Rao Fu et.al. 2411.18932 link
2024-11-27 Active Data Curation Effectively Distills Large-Scale Multimodal Models Vishaal Udandarao et.al. 2411.18674 null
2024-11-27 AMPS: ASR with Multimodal Paraphrase Supervision Amruta Parulekar et.al. 2411.18368 null
2024-12-03 Large Language Model-Brained GUI Agents: A Survey Chaoyun Zhang et.al. 2411.18279 link
2024-11-27 Grid-augumented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents Joongwon Chae et.al. 2411.18270 link
2024-11-27 Multimodal Integration of Longitudinal Noninvasive Diagnostics for Survival Prediction in Immunotherapy Using Deep Learning Melda Yeghaian et.al. 2411.18253 null
2024-11-26 NEMO: Can Multimodal LLMs Identify Attribute-Modified Objects? Jiaxuan Li et.al. 2411.17794 null
2024-11-26 Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis Akshita Gupta et.al. 2411.17690 null
2024-11-26 AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM Jiarui Wang et.al. 2411.17221 link
2024-11-26 Learning Robust Anymodal Segmentor with Unimodal and Cross-modal Distillation Xu Zheng et.al. 2411.17141 link
2024-11-26 Relations, Negations, and Numbers: Looking for Logic in Generative Text-to-Image Models Colin Conwell et.al. 2411.17066 link
2024-11-26 Multimodal Alignment and Fusion: A Survey Songtao Li et.al. 2411.17040 null
2024-11-27 SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE Yongwei Chen et.al. 2411.16856 null
2024-11-23 Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents Jun Chen et.al. 2411.16740 link
2024-11-26 All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages Ashmal Vayani et.al. 2411.16508 link
2024-11-25 Boosting 3D Object Generation through PBR Materials Yitong Wang et.al. 2411.16080 null
2024-11-24 M3-CVC: Controllable Video Compression with Multimodal Generative Models Rui Wan et.al. 2411.15798 null
2024-11-23 Knowledge Transfer Across Modalities with Natural Language Supervision Carlo Alberto Barbano et.al. 2411.15611 null
2024-11-23 From Complexity to Parsimony: Integrating Latent Class Analysis to Uncover Multimodal Learning Patterns in Collaborative Learning Lixiang Yan et.al. 2411.15590 null
2024-11-23 Botfip-LLM: An Enhanced Multimodal Scientific Computing Framework Leveraging Knowledge Distillation from Large Language Models Tianhao Chen et.al. 2411.15525 null
2024-11-23 MambaVLT: Time-Evolving Multimodal State Space Model for Vision-Language Tracking Xinqi Liu et.al. 2411.15459 null
2024-11-23 freePruner: A Training-free Approach for Large Multimodal Model Acceleration Bingxin Xu et.al. 2411.15446 null
2024-11-22 PRIMUS: Pretraining IMU Encoders with Multimodal Self-Supervision Arnav M. Das et.al. 2411.15127 null
2024-11-22 Large Multi-modal Models Can Interpret Features in Large Multi-modal Models Kaichen Zhang et.al. 2411.14982 link
2024-11-25 Information Extraction from Heterogeneous Documents without Ground Truth Labels using Synthetic Label Generation and Knowledge Distillation Aniket Bhattacharyya et.al. 2411.14957 null
2024-11-22 Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains Yurii Paniv et.al. 2411.14647 null
2024-11-21 Generative AI for Music and Audio Hao-Wen Dong et.al. 2411.14627 null
2024-11-21 FuseGPT: Learnable Layers Fusion of Generative Pre-trained Transformers Zehua Pei et.al. 2411.14507 null
2024-11-21 MMGenBench: Evaluating the Limits of LMMs from the Text-to-Image Generation Perspective Hailang Huang et.al. 2411.14062 link
2024-11-21 Multimodal 3D Reasoning Segmentation with Complex Scenes Xueying Jiang et.al. 2411.13927 null
2024-11-20 VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation Ziyang Luo et.al. 2411.13281 null
2024-11-19 VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge Vishwesh Nath et.al. 2411.12915 null
2024-11-19 Mitigating Perception Bias: A Training-Free Approach to Enhance LMM for Image Quality Assessment Siyi Pan et.al. 2411.12791 null
2024-11-18 MMBind: Unleashing the Potential of Distributed and Heterogeneous Data for Multimodal Learning in IoT Xiaomin Ouyang et.al. 2411.12126 null
2024-11-17 SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization Hongrui Jia et.al. 2411.11909 link
2024-11-18 The Power of Many: Multi-Agent Multimodal Models for Cultural Image Captioning Longju Bai et.al. 2411.11758 link
2024-11-18 Artificial Scientific Discovery Antonio Norelli et.al. 2411.11672 null
2024-11-18 InstruGen: Automatic Instruction Generation for Vision-and-Language Navigation Via Large Multimodal Models Yu Yan et.al. 2411.11394 null
2024-11-19 SoK: Unifying Cybersecurity and Cybersafety of Multimodal Foundation Models with an Information Theory Approach Ruoxi Sun et.al. 2411.11195 null
2024-11-16 ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models Vipula Rawte et.al. 2411.10867 null
2024-11-19 MLAN: Language-Based Instruction Tuning Improves Zero-Shot Generalization of Multimodal Large Language Models Jianhong Tu et.al. 2411.10557 link
2024-11-15 Everything is a Video: Unifying Modalities through Next-Frame Prediction G. Thomas Hudson et.al. 2411.10503 null
2024-11-15 Weakly-Supervised Multimodal Learning on MIMIC-CXR Andrea Agostini et.al. 2411.10356 link
2024-11-21 Instruction-Guided Editing Controls for Images and Multimedia: A Survey in LLM era Thanh Tam Nguyen et.al. 2411.09955 link
2024-11-14 Cross-Modal Consistency in Multimodal Large Language Models Xiang Zhang et.al. 2411.09273 null
2024-11-14 SmartInv: Multimodal Learning for Smart Contract Invariant Inference Sally Junsong Wang et.al. 2411.09217 null
2024-11-13 Multimodal Object Detection using Depth and Image Data for Manufacturing Parts Nazanin Mahjourian et.al. 2411.09062 null
2024-11-13 Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions Moran Yanuka et.al. 2411.09018 null
2024-11-13 AstroM$^3$: A self-supervised multimodal model for astronomy Mariia Rizhko et.al. 2411.08842 null
2024-11-13 Multimodal Instruction Tuning with Hybrid State Space Models Jianing Zhou et.al. 2411.08840 null
2024-11-13 Retrieval Augmented Recipe Generation Guoshan Liu et.al. 2411.08715 null
2024-11-12 DPU: Dynamic Prototype Updating for Multimodal Out-of-Distribution Detection Shawn Li et.al. 2411.08227 link
2024-11-12 Leveraging Multimodal Models for Enhanced Neuroimaging Diagnostics in Alzheimer's Disease Francesco Chiumento et.al. 2411.07871 null
2024-11-12 SparrowVQE: Visual Question Explanation for Course Content Understanding Jialu Li et.al. 2411.07516 link
2024-11-12 BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions Anas Awadalla et.al. 2411.07461 null
2024-11-11 Multimodal Fusion Balancing Through Game-Theoretic Regularization Konstantinos Kontras et.al. 2411.07335 null
2024-11-11 OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision Cong Wei et.al. 2411.07199 null
2024-11-09 M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework Yew Ken Chia et.al. 2411.06176 null
2024-11-09 An Empirical Analysis on Spatial Reasoning Capabilities of Large Multimodal Models Fatemeh Shiri et.al. 2411.06048 link
2024-11-08 Towards Low-Resource Harmful Meme Detection with LMM Agents Jianzhao Huang et.al. 2411.05383 link
2024-11-08 Exploring the Alignment Landscape: LLMs and Geometric Deep Models in Protein Representation Dong Shu et.al. 2411.05316 link
2024-11-07 HourVideo: 1-Hour Video-Language Understanding Keshigeyan Chandrasegaran et.al. 2411.04998 link
2024-11-07 VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos Shehan Munasinghe et.al. 2411.04923 null
2024-11-07 Exploring Hierarchical Molecular Graph Representation in Multimodal LLMs Chengxin Hu et.al. 2411.04708 null
2024-11-06 AutoGameUI: Constructing High-Fidelity Game UIs via Multimodal Learning and Interactive Web-Based Tool Zhongliang Tang et.al. 2411.03709 null
2024-11-05 MME-Finance: A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning Ziliang Gan et.al. 2411.03314 null
2024-11-05 HumanVLM: Foundation for Human-Scene Vision-Language Model Dawei Dai et.al. 2411.03034 null
2024-11-05 Toward Robust Incomplete Multimodal Sentiment Analysis via Hierarchical Representation Learning Mingcheng Li et.al. 2411.02793 null
2024-11-11 INQUIRE: A Natural World Text-to-Image Retrieval Benchmark Edward Vendrow et.al. 2411.02537 link
2024-11-04 See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers Jiaxin Zhuang et.al. 2411.02465 null
2024-11-07 TableGPT2: A Large Multimodal Model with Tabular Data Integration Aofeng Su et.al. 2411.02059 link
2024-11-04 Foundations and Recent Trends in Multimodal Mobile Agents: A Survey Biao Wu et.al. 2411.02006 link
2024-11-04 KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension Jie Yang et.al. 2411.01846 null
2024-11-03 EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark Ming Li et.al. 2411.01492 null
2024-11-03 Classifier-guided Gradient Modulation for Enhanced Multimodal Learning Zirun Guo et.al. 2411.01409 link
2024-11-02 LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding Jian Chen et.al. 2411.01106 null
2024-11-01 Text2Freq: Learning Series Patterns from Text via Frequency Domain Ming-Chih Lo et.al. 2411.00929 null
2024-11-01 V-LoRA: An Efficient and Flexible System Boosts Vision Applications with LoRA LMM Liang Mi et.al. 2411.00915 null
2024-11-01 Analyzing Multimodal Integration in the Variational Autoencoder from an Information-Theoretic Perspective Carlotta Langer et.al. 2411.00522 null
2024-10-31 TurtleBench: A Visual Programming Benchmark in Turtle Geometry Sina Rismanchian et.al. 2411.00264 link
2024-10-31 ResiDual Transformer Alignment with Spectral Decomposition Lorenzo Basile et.al. 2411.00246 null
2024-10-31 Nearest Neighbor Normalization Improves Multimodal Retrieval Neil Chowdhury et.al. 2410.24114 link
2024-11-04 AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents Yifan Xu et.al. 2410.24024 link
2024-10-31 Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models Hao Yang et.al. 2410.23861 null
2024-10-30 CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP Tianyu Yang et.al. 2410.23330 null
2024-10-30 EMMA: End-to-End Multimodal Model for Autonomous Driving Jyh-Jing Hwang et.al. 2410.23262 null
2024-10-29 ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding Kimihiro Hasegawa et.al. 2410.22211 link
2024-10-29 Beyond Text: Optimizing RAG with Multimodal Inputs for Industrial Applications Monica Riedler et.al. 2410.21943 link
2024-10-28 AiSciVision: A Framework for Specializing Large Multimodal Models in Scientific Image Classification Brendan Hogan et.al. 2410.21480 link
2024-10-27 Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse Ryan Liu et.al. 2410.21333 null
2024-10-28 IndraEye: Infrared Electro-Optical UAV-based Perception Dataset for Robust Downstream Tasks Manjunath D et.al. 2410.20953 link
2024-10-27 Generator Matching: Generative modeling with arbitrary Markov processes Peter Holderrieth et.al. 2410.20587 null
2024-10-27 PaPaGei: Open Foundation Models for Optical Physiological Signals Arvind Pillai et.al. 2410.20542 link
2024-10-25 Turn-by-Turn Indoor Navigation for the Visually Impaired Santosh Srinivasaiah et.al. 2410.19954 null
2024-10-25 A Multimodal Approach For Endoscopic VCE Image Classification Using BiomedCLIP-PubMedBERT Nagarajan Ganapathy et.al. 2410.19944 link
2024-10-25 OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization Hongliang He et.al. 2410.19609 link
2024-10-24 Visual Text Matters: Improving Text-KVQA with Visual Text Entity Knowledge-aware Large Multimodal Assistant Abhirama Subramanyam Penamakuri et.al. 2410.19144 link
2024-10-24 VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks Lawrence Jang et.al. 2410.19100 null
2024-10-24 CAMEL-Bench: A Comprehensive Arabic LMM Benchmark Sara Ghaboura et.al. 2410.18976 link
2024-10-24 Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques David Ortiz-Perez et.al. 2410.18972 null
2024-10-24 OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning Xiaoqiang Wang et.al. 2410.18963 null
2024-10-24 A Survey of Multimodal Sarcasm Detection Shafkat Farabi et.al. 2410.18882 null
2024-10-27 R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models Linger Deng et.al. 2410.17885 link
2024-10-22 JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation Shota Onohara et.al. 2410.17250 null
2024-10-22 An Eye for an AI: Evaluating GPT-4o's Visual Perception Skills and Geometric Reasoning Skills Using Computer Graphics Questions Tony Haoran Feng et.al. 2410.16991 null
2024-10-21 DocEdit-v2: Document Structure Editing Via Multimodal LLM Grounding Manan Suri et.al. 2410.16472 null
2024-10-21 Promoting cross-modal representations to improve multimodal foundation models for physiological signals Ching Fang et.al. 2410.16424 null
2024-10-22 Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance Zhangwei Gao et.al. 2410.16261 link
2024-10-22 MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report Samrajya Thapa et.al. 2410.16239 link
2024-10-21 Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models Yufei Zhan et.al. 2410.16163 link
2024-10-21 LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset Ruikun Zhang et.al. 2410.16095 link
2024-10-21 How to Build a Pre-trained Multimodal model for Simultaneously Chatting and Decision-making? Zuojin Tang et.al. 2410.15885 null
2024-10-21 Multimodal Learning for Embryo Viability Prediction in Clinical IVF Junsik Kim et.al. 2410.15581 null
2024-10-20 IPO: Interpretable Prompt Optimization for Vision-Language Models Yingjun Du et.al. 2410.15397 link
2024-10-20 Modality-Fair Preference Optimization for Trustworthy MLLM Alignment Songtao Jiang et.al. 2410.15334 null
2024-10-19 ChitroJera: A Regionally Relevant Visual Question Answering Dataset for Bangla Deeparghya Dutta Barua et.al. 2410.14991 null
2024-10-19 SemiHVision: Enhancing Medical Multimodal Models with a Semi-Human Annotated Dataset and Fine-Tuned Instruction Generation Junda Wang et.al. 2410.14948 link
2024-10-18 Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension Yin Xie et.al. 2410.14332 link
2024-10-18 Personalized Image Generation with Large Multimodal Models Yiyan Xu et.al. 2410.14170 null
2024-10-18 Coherence-Driven Multimodal Safety Dialogue with Active Learning for Embodied Agents Sabit Hassan et.al. 2410.14141 null
2024-10-17 Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation Chengyue Wu et.al. 2410.13848 link
2024-10-18 Harnessing Webpage UIs for Text-Rich Visual Understanding Junpeng Liu et.al. 2410.13824 null
2024-10-17 Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR Abhishek Gupta et.al. 2410.13445 null
2024-10-16 The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio Sicong Leng et.al. 2410.12787 null
2024-10-16 HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks Fengji Zhang et.al. 2410.12381 link
2024-10-15 CtrlSynth: Controllable Image Text Synthesis for Data-Efficient Multimodal Learning Qingqing Cao et.al. 2410.11963 null
2024-10-15 Generalizable Spacecraft Trajectory Generation via Multimodal Learning with Transformers Davide Celestini et.al. 2410.11723 null
2024-10-15 Unveiling the Mystery of Visual Attributes of Concrete and Abstract Concepts: Variability, Nearest Neighbors, and Challenging Categories Tarun Tater et.al. 2410.11657 link
2024-10-15 On-the-fly Modulation for Balanced Multimodal Learning Yake Wei et.al. 2410.11582 link
2024-10-15 Enhancing Unimodal Latent Representations in Multimodal VAEs through Iterative Amortized Inference Yuta Oshima et.al. 2410.11403 null
2024-10-14 Saliency Guided Optimization of Diffusion Latents Xiwen Wang et.al. 2410.10257 null
2024-10-14 MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models Peng Xia et.al. 2410.10139 link
2024-10-13 LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models Junyan Ye et.al. 2410.09732 null
2024-10-12 Reconstructive Visual Instruction Tuning Haochen Wang et.al. 2410.09575 null
2024-10-11 Can GPTs Evaluate Graphic Design Based on Design Principles? Daichi Haraguchi et.al. 2410.08885 null
2024-10-11 VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding Houlun Chen et.al. 2410.08593 link
2024-10-10 ElasticTok: Adaptive Tokenization for Image and Video Wilson Yan et.al. 2410.08368 null
2024-10-10 Flex-MoE: Modeling Arbitrary Modality Combination via the Flexible Mixture-of-Experts Sukwon Yun et.al. 2410.08245 link
2024-10-10 LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts Anh-Quan Cao et.al. 2410.08211 null
2024-10-10 Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision Shengcao Cao et.al. 2410.08209 null
2024-10-10 MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models Wenbo Hu et.al. 2410.08182 null
2024-10-10 Generated Bias: Auditing Internal Bias Dynamics of Text-To-Image Generative Models Abhishek Mandal et.al. 2410.07884 null
2024-10-09 The Cognitive Capabilities of Generative AI: A Comparative Analysis with Human Benchmarks Isaac R. Galatzer-Levy et.al. 2410.07391 null
2024-10-12 Deep Correlated Prompting for Visual Recognition with Missing Modalities Lianyu Hu et.al. 2410.06558 link
2024-10-11 Chip-Tuning: Classify Before Language Models Say Fangwei Zhu et.al. 2410.06541 link
2024-10-09 Does Spatial Cognition Emerge in Frontier Models? Santhosh Kumar Ramakrishnan et.al. 2410.06468 null
2024-10-08 Multimodal Representation Learning using Adaptive Graph Construction Weichen Huang et.al. 2410.06395 null
2024-10-08 Temporal Image Caption Retrieval Competition -- Description and Results Jakub Pokrywka et.al. 2410.06314 null
2024-10-08 PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling Xudong Xie et.al. 2410.05970 link
2024-10-08 ModalPrompt:Dual-Modality Guided Prompt for Continual Learning of Large Multimodal Models Fanhu Zeng et.al. 2410.05849 null
2024-10-08 Multimodal Large Language Models and Tunings: Vision, Language, Sensors, Audio, and Beyond Soyeon Caren Han et.al. 2410.05608 link
2024-10-08 TeaserGen: Generating Teasers for Long Documentaries Weihan Xu et.al. 2410.05586 null
2024-10-07 R-Bench: Are your Large Multimodal Model Robust to Real-world Corruptions? Chunyi Li et.al. 2410.05474 link
2024-10-07 RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction Yuwei Zhang et.al. 2410.05361 null
2024-10-07 Patch is Enough: Naturalistic Adversarial Patch against Vision-Language Pre-training Models Dehong Kong et.al. 2410.04884 null
2024-10-06 VISTA: A Visual and Textual Attention Dataset for Interpreting Multimodal Models Harshit et.al. 2410.04609 null
2024-10-06 UniMuMo: Unified Text, Music and Motion Generation Han Yang et.al. 2410.04534 link
2024-10-08 Gamified crowd-sourcing of high-quality data for visual fine-tuning Shashank Yadav et.al. 2410.04038 null
2024-10-07 Multimodal Point-of-Interest Recommendation Yuta Kanzawa et.al. 2410.03265 null
2024-10-04 Bridging the Gap between Text, Audio, Image, and Any Sequence: A Novel Approach using Gloss-based Annotation Sen Fang et.al. 2410.03146 null
2024-10-04 AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark Wenhao Chai et.al. 2410.03051 null
2024-10-07 CPFD: Confidence-aware Privileged Feature Distillation for Short Video Classification Jinghao Shi et.al. 2410.03038 null
2024-10-07 MMP: Towards Robust Multi-Modal Learning with Masked Modality Projection Niki Nezakati et.al. 2410.03010 null
2024-10-03 Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos Jianrui Zhang et.al. 2410.02763 null
2024-10-03 Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models Zhengfeng Lai et.al. 2410.02740 null
2024-10-04 Video Instruction Tuning With Synthetic Data Yuanhan Zhang et.al. 2410.02713 null
2024-10-03 LLaVA-Critic: Learning to Evaluate Multimodal Models Tianyi Xiong et.al. 2410.02712 null
2024-10-03 Plots Unlock Time-Series Understanding in Multimodal Models Mayank Daswani et.al. 2410.02637 null
2024-10-02 Anchors Aweigh! Sail for Optimal Unified Multi-Modal Representations Minoh Jeong et.al. 2410.02086 null
2024-10-02 Toward a Holistic Evaluation of Robustness in CLIP Models Weijie Tu et.al. 2410.01534 null
2024-10-02 SHAP-CAT: A interpretable multi-modal framework enhancing WSI classification via virtual staining and shapley-value-based multimodal fusion Jun Wang et.al. 2410.01408 null
2024-10-02 Backdooring Vision-Language Models with Out-Of-Distribution Data Weimin Lyu et.al. 2410.01264 null
2024-10-02 OCC-MLLM:Empowering Multimodal Large Language Model For the Understanding of Occluded Objects Wenmo Qiu et.al. 2410.01261 null
2024-09-30 Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning Weitai Kang et.al. 2410.00255 link
2024-09-30 Using Large Multimodal Models to Extract Knowledge Components for Knowledge Tracing from Multimedia Question Information Hyeongdon Moon et.al. 2409.20167 link
2024-10-02 Visual Context Window Extension: A New Perspective for Long Video Understanding Hongchen Wei et.al. 2409.20018 null
2024-09-30 Towards Robust Multimodal Sentiment Analysis with Incomplete Data Haoyu Zhang et.al. 2409.20012 link
2024-09-28 FairPIVARA: Reducing and Assessing Biases in CLIP-Based Multimodal Models Diego A. B. Moreira et.al. 2409.19474 link
2024-09-28 From Unimodal to Multimodal: Scaling up Projectors to Align Modalities Mayug Maniparambil et.al. 2409.19425 null
2024-10-02 CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling Jihai Zhang et.al. 2409.19291 link
2024-09-28 TrojVLM: Backdoor Attack Against Vision Language Models Weimin Lyu et.al. 2409.19232 null
2024-09-27 Multimodal Markup Document Models for Graphic Design Completion Kotaro Kikuchi et.al. 2409.19051 null
2024-09-27 Emu3: Next-Token Prediction is All You Need Xinlong Wang et.al. 2409.18869 null
2024-09-27 Data Analysis in the Era of Generative AI Jeevana Priya Inala et.al. 2409.18475 null
2024-09-26 MultiClimate: Multimodal Stance Detection on Climate Change Videos Jiawen Wang et.al. 2409.18346 link
2024-09-26 LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness Chenming Zhu et.al. 2409.18125 null
2024-09-26 GSON: A Group-based Social Navigation Framework with Large Multimodal Model Shangyi Luo et.al. 2409.18084 null
2024-09-26 A Multimodal Single-Branch Embedding Network for Recommendation in Cold-Start and Missing Modality Scenarios Christian Ganhör et.al. 2409.17864 link
2024-09-26 Harnessing Shared Relations via Multimodal Mixup Contrastive Learning for Multimodal Classification Raja Kumar et.al. 2409.17777 link
2024-09-26 MIO: A Foundation Model on Multimodal Tokens Zekun Wang et.al. 2409.17692 link
2024-09-25 Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models Matt Deitke et.al. 2409.17146 link
2024-09-24 CDChat: A Large Multimodal Model for Remote Sensing Change Description Mubashir Noman et.al. 2409.16261 link
2024-09-24 CLSP: High-Fidelity Contrastive Language-State Pre-training for Agent State Representation Fuxian Huang et.al. 2409.15806 null
2024-09-18 Recommendation with Generative Models Yashar Deldjoo et.al. 2409.15173 null
2024-09-23 With Ears to See and Eyes to Hear: Sound Symbolism Experiments with Multimodal Large Language Models Tyler Loakman et.al. 2409.14917 link
2024-09-22 Patch Ranking: Efficient CLIP by Learning to Rank Local Patches Cheng-En Wu et.al. 2409.14607 null
2024-09-22 Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models Yew Ken Chia et.al. 2409.14277 null
2024-09-20 Brain-Cognition Fingerprinting via Graph-GCCA with Contrastive Learning Yixin Wang et.al. 2409.13887 null
2024-09-20 Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model Li Zhou et.al. 2409.13407 link
2024-09-20 A Novel Adaptive Fine-Tuning Algorithm for Multimodal Models: Self-Optimizing Classification and Selection of High-Quality Datasets in Remote Sensing Yi Ren et.al. 2409.13345 null
2024-09-20 ChemDFM-X: Towards Large Multimodal Model for Chemistry Zihan Zhao et.al. 2409.13194 null
2024-09-19 MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines Dongzhi Jiang et.al. 2409.12959 null
2024-09-24 TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation Junjie Wen et.al. 2409.12514 null
2024-09-18 Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Peng Wang et.al. 2409.12191 link
2024-09-18 All-in-one foundational models learning across quantum chemical levels Yuxinxin Chen et.al. 2409.12015 link
2024-09-18 LMMCoDrive: Cooperative Driving with Large Multimodal Model Haichao Liu et.al. 2409.11981 link
2024-09-16 MusicLIME: Explainable Multimodal Music Understanding Theodoros Sotirou et.al. 2409.10496 link
2024-09-19 IRIS: Interactive Responsive Intelligent Segmentation for 3D Affordance Analysis Meng Chu et.al. 2409.10078 null
2024-09-16 AceParse: A Comprehensive Dataset with Diverse Structured Texts for Academic Literature Parsing Huawei Ji et.al. 2409.10016 link
2024-09-14 Keypoints-Integrated Instruction-Following Data Generation for Enhanced Human Pose Understanding in Multimodal Models Dewen Zhang et.al. 2409.09306 null
2024-09-13 Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing Minh-Duc Vu et.al. 2409.08885 null
2024-09-13 A Multimodal Approach for Fluid Overload Prediction: Integrating Lung Ultrasound and Clinical Data Tianqi Yang et.al. 2409.08790 null
2024-09-13 Dynamics of Collective Group Affect: Group-level Annotations and the Multimodal Modeling of Convergence and Divergence Navin Raj Prabhu et.al. 2409.08578 null
2024-09-13 A Comprehensive Survey on Deep Multimodal Learning with Missing Modality Renjie Wu et.al. 2409.07825 null
2024-09-12 Top-down Activity Representation Learning for Video Question Answering Yanan Wang et.al. 2409.07748 null
2024-09-11 What to align in multimodal contrastive learning? Benoit Dufumier et.al. 2409.07402 null
2024-09-11 MVLLaVA: An Intelligent Agent for Unified and Flexible Novel View Synthesis Hanyu Jiang et.al. 2409.07129 null
2024-09-11 FSMDet: Vision-guided feature diffusion for fully sparse 3D detector Tianran Liu et.al. 2409.06945 null
2024-09-16 Scaling Law Hypothesis for Multimodal Model Qingyun Sun et.al. 2409.06754 null
2024-09-10 Multiclass Arrhythmia Classification using Smartwatch Photoplethysmography Signals Collected in Real-life Settings Dong Han et.al. 2409.06147 null
2024-09-11 A Survey of Multimodal Composite Editing and Retrieval Suyan Li et.al. 2409.05405 link
2024-09-05 Learning in Order! A Sequential Strategy to Learn Invariant Features for Multimodal Sentiment Analysis Xianbing Zhao et.al. 2409.04473 null
2024-09-06 Generating Faithful and Salient Text from Multimodal Data Tahsina Hashem et.al. 2409.03961 link
2024-09-06 CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models Wentao Liu et.al. 2409.02834 link
2024-09-10 MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark Xiang Yue et.al. 2409.02813 null
2024-09-04 Understanding eGFR Trajectories and Kidney Function Decline via Large Multimodal Models Chih-Yuan Li et.al. 2409.02530 null
2024-09-03 Blocks as Probes: Dissecting Categorization Ability of Large Multimodal Models Bin Fu et.al. 2409.01560 null
2024-09-03 Think Twice Before Recognizing: Large Multimodal Models for General Fine-grained Traffic Sign Recognition Yaozong Gan et.al. 2409.01534 null
2024-09-02 Towards General Industrial Intelligence: A Survey on IIoT-Enhanced Continual Large Models Jiao Chen et.al. 2409.01207 null
2024-09-02 Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information Yi Chen et.al. 2409.01179 null
2024-08-31 Comparative Analysis of Modality Fusion Approaches for Audio-Visual Person Identification and Verification Aref Farhadipour et.al. 2409.00562 null
2024-08-30 UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios Baichuan Zhou et.al. 2408.17267 null
2024-08-29 Seeking the Sufficiency and Necessity Causal Features in Multimodal Representation Learning Boyu Chen et.al. 2408.16577 null
2024-08-29 Toward Robust Early Detection of Alzheimer's Disease via an Integrated Multimodal Learning Approach Yifei Chen et.al. 2408.16343 link
2024-08-28 Meta-Learn Unimodal Signals with Weak Supervision for Multimodal Sentiment Analysis Sijie Mai et.al. 2408.16029 null
2024-08-28 ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation Tiantian Feng et.al. 2408.15803 null
2024-08-28 Visual Prompt Engineering for Medical Vision Language Models in Radiology Stefan Denner et.al. 2408.15802 null
2024-08-27 X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation Hanjia Lyu et.al. 2408.15172 null
2024-08-27 The Benefits of Balance: From Information Projections to Variance Reduction Lang Liu et.al. 2408.15065 null
2024-08-27 NeuralOOD: Improving Out-of-Distribution Generalization Performance with Brain-machine Fusion Learning Framework Shuangchen Zhao et.al. 2408.14950 null
2024-08-26 MMR: Evaluating Reading Ability of Large Multimodal Models Jian Chen et.al. 2408.14594 null
2024-09-03 Foundation Models for Music: A Survey Yinghao Ma et.al. 2408.14340 link
2024-08-26 LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models Qihang Ge et.al. 2408.14008 null
2024-08-27 Quantum Multimodal Contrastive Learning Framework Chi-Sheng Chen et.al. 2408.13919 null
2024-08-25 Tangram: A Challenging Benchmark for Geometric Element Recognizing Jiamin Tang et.al. 2408.13854 null
2024-08-25 Multimodal Ensemble with Conditional Feature Fusion for Dysgraphia Diagnosis in Children from Handwriting Samples Jayakanth Kunhoth et.al. 2408.13754 null
2024-08-24 Preliminary Investigations of a Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models Sakhinana Sagar Srinivas et.al. 2408.13621 null
2024-08-23 Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption Sakhinana Sagar Srinivas et.al. 2408.13248 null
2024-08-23 Indoor scene recognition from images under visual corruptions Willams de Lima Costa et.al. 2408.13029 null
2024-08-23 Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition Cam-Van Thi Nguyen et.al. 2408.12895 null
2024-08-23 Has Multimodal Learning Delivered Universal Intelligence in Healthcare? A Comprehensive Survey Qika Lin et.al. 2408.12880 link
2024-08-22 Assessing Modality Bias in Video Question Answering Benchmarks with Multimodal Large Language Models Jean Park et.al. 2408.12763 null
2024-08-22 Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization Luyao Cheng et.al. 2408.12102 null
2024-08-22 Mental-Perceiver: Audio-Textual Multimodal Learning for Mental Health Assessment Jinghui Qin et.al. 2408.12088 null
2024-08-21 GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models Jonathan Roberts et.al. 2408.11817 null
2024-08-21 D-RMGPT: Robot-assisted collaborative tasks driven by large multimodal models M. Forlini et.al. 2408.11761 null
2024-08-21 UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation Xiangyu Zhao et.al. 2408.11305 link
2024-08-21 BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation Haotian Peng et.al. 2408.11281 link
2024-08-20 Exploring the use of Generative AI to Support Automated Just-in-Time Programming for Visual Scene Displays Cynthia Zastudil et.al. 2408.11137 null
2024-08-21 SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition Zebang Cheng et.al. 2408.10500 link
2024-08-19 Enhance Modality Robustness in Text-Centric Multimodal Alignment with Adversarial Prompting Yun-Da Tsai et.al. 2408.09798 null
2024-08-19 Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation Yunxin Li et.al. 2408.09787 link
2024-08-18 PA-LLaVA: A Large Language-Vision Assistant for Human Pathology Image Understanding Dawei Dai et.al. 2408.09530 link
2024-08-17 Measuring Visual Sycophancy in Multimodal Models Jaehyuk Lim et.al. 2408.09111 link
2024-08-16 AdaRank: Disagreement Based Module Rank Prediction for Low-rank Adaptation Yihe Dong et.al. 2408.09015 link
2024-08-16 xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Le Xue et.al. 2408.08872 null
2024-08-16 Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs Jinming Liu et.al. 2408.08575 null
2024-08-15 LLaVA-Surg: Towards Multimodal Surgical Assistant via Structured Surgical Video Learning Jiajie Li et.al. 2408.07981 null
2024-08-15 MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark Minxuan Zhou et.al. 2408.07543 link
2024-08-14 Modality Invariant Multimodal Learning to Handle Missing Modalities: A Single-Branch Approach Muhammad Saad Saeed et.al. 2408.07445 null
2024-08-14 Robust Semi-supervised Multimodal Medical Image Segmentation via Cross Modality Collaboration Xiaogen Zhon et.al. 2408.07341 link
2024-08-14 Enhancing Visual Question Answering through Ranking-Based Hybrid Training and Multimodal Fusion Peiyuan Chen et.al. 2408.07303 null
2024-08-13 PathInsight: Instruction Tuning of Multimodal Datasets and Models for Intelligence Assisted Diagnosis in Histopathology Xiaomin Wu et.al. 2408.07037 null
2024-08-13 EditScribe: Non-Visual Image Editing with Natural Language Verification Loops Ruei-Che Chang et.al. 2408.06632 null
2024-08-13 CROME: Cross-Modal Adapters for Efficient Multimodal LLM Sayna Ebrahimi et.al. 2408.06610 null
2024-08-13 Prioritizing Modalities: Flexible Importance Scheduling in Federated Multimodal Learning Jieming Bian et.al. 2408.06549 null
2024-08-12 VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents Xiao Liu et.al. 2408.06327 link
2024-08-11 HateSieve: A Contrastive Learning Framework for Detecting and Segmenting Hateful Content in Multimodal Memes Xuanyu Su et.al. 2408.05794 null
2024-08-08 Enhancing Journalism with AI: A Study of Contextualized Image Captioning for News Articles using LLMs and LMMs Aliki Anagnostopoulou et.al. 2408.04331 null
2024-08-06 LLaVA-OneVision: Easy Visual Task Transfer Bo Li et.al. 2408.03326 link
2024-08-06 Multitask and Multimodal Neural Tuning for Large Models Hao Sun et.al. 2408.03001 null
2024-08-06 Body of Her: A Preliminary Study on End-to-End Humanoid Agent Tenglong Ao et.al. 2408.02879 null
2024-08-04 Distribution-Level Memory Recall for Continual Learning: Preserving Knowledge and Avoiding Confusion Shaoxu Cheng et.al. 2408.02695 null
2024-08-02 A Systematic Review of Intermediate Fusion in Multimodal Deep Learning for Biomedical Applications Valerio Guarrasi et.al. 2408.02686 null
2024-08-05 REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models Agneet Chatterjee et.al. 2408.02231 null
2024-08-04 CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization Xiang He et.al. 2408.01952 link
2024-08-02 MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models Benno Weck et.al. 2408.01337 link
2024-08-05 Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions Jin Gao et.al. 2408.01091 link
2024-08-02 GraphAge: Unleashing the power of Graph Neural Network to Decode Epigenetic Aging Saleh Sakib Ahmed et.al. 2408.00984 link
2024-08-01 MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities Weihao Yu et.al. 2408.00765 link
2024-08-01 GalleryGPT: Analyzing Paintings with Large Multimodal Models Yi Bin et.al. 2408.00491 link
2024-08-01 Everything We Hear: Towards Tackling Misinformation in Podcasts Sachin Pathiyan Cherumanal et.al. 2408.00292 null
2024-08-01 OmniParser for Pure Vision Based GUI Agent Yadong Lu et.al. 2408.00203 null
2024-07-30 Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection Jinfa Huang et.al. 2407.21004 null
2024-07-30 HyperMM : Robust Multimodal Learning with Varying-sized Inputs Hava Chaptoukaev et.al. 2407.20768 null
2024-07-30 Effectively Leveraging CLIP for Generating Situational Summaries of Images and Videos Dhruv Verma et.al. 2407.20642 link
2024-07-29 Adversarial Robustness in RGB-Skeleton Action Recognition: Leveraging Attention Modality Reweighter Chao Liu et.al. 2407.19981 null
2024-07-29 ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2 Wenjun Huang et.al. 2407.19832 null
2024-08-02 XLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training Biao Wu et.al. 2407.19546 link
2024-07-28 Detached and Interactive Multimodal Learning Yunfeng Fan et.al. 2407.19514 link
2024-07-27 Data Processing Techniques for Modern Multimodal Models Yinheng Li et.al. 2407.19180 null
2024-07-26 MangaUB: A Manga Understanding Benchmark for Large Multimodal Models Hikaru Ikuta et.al. 2407.19034 null
2024-07-26 Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment Yuze Zheng et.al. 2407.18854 null
2024-07-26 ChatSchema: A pipeline of extracting structured information with Large Multimodal Models based on schema Fei Wang et.al. 2407.18716 null
2024-07-25 Sparse vs Contiguous Adversarial Pixel Perturbations in Multimodal Models: An Empirical Analysis Cristian-Alexandru Botocan et.al. 2407.18251 link
2024-07-25 $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs Vlad Sobal et.al. 2407.18134 null
2024-07-25 Cross-Vendor Reproducibility of Radiomics-based Machine Learning Models for Computer-aided Diagnosis Jatin Chaudhary et.al. 2407.18060 null
2024-07-25 What does Kiki look like? Cross-modal associations between speech sounds and visual shapes in vision-and-language models Tessa Verhoef et.al. 2407.17974 null
2024-07-25 Shapley Value-based Contrastive Alignment for Multimodal Information Extraction Wen Luo et.al. 2407.17854 null
2024-07-25 Enhancing Model Performance: Another Approach to Vision-Language Instruction Tuning Vedanshu et.al. 2407.17813 null
2024-07-25 KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models Eunice Yiu et.al. 2407.17773 link
2024-07-24 Testing Large Language Models on Driving Theory Knowledge and Skills for Connected Autonomous Vehicles Zuoyin Tang et.al. 2407.17211 null
2024-07-23 Chameleon: Images Are What You Need For Multimodal Learning Robust To Missing Modalities Muhammad Irzam Liaqat et.al. 2407.16243 null
2024-07-22 LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding Haoning Wu et.al. 2407.15754 link
2024-07-22 Resource-Efficient Federated Multimodal Learning via Layer-wise and Progressive Training Ye Lin Tun et.al. 2407.15426 null
2024-07-21 VideoGameBunny: Towards vision assistants for video games Mohammad Reza Taesiri et.al. 2407.15295 null
2024-07-22 Patch-based Intuitive Multimodal Prototypes Network (PIMPNet) for Alzheimer's Disease classification Lisa Anita De Santi et.al. 2407.14277 link
2024-07-18 Visual Haystacks: Answering Harder Questions About Sets of Images Tsung-Han Wu et.al. 2407.13766 link
2024-07-17 Text- and Feature-based Models for Compound Multimodal Emotion Recognition in the Wild Nicolas Richet et.al. 2407.12927 link
2024-07-16 ChatBCG: Can AI Read Your Slide Deck? Nikita Singh et.al. 2407.12875 null
2024-07-17 LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models Kaichen Zhang et.al. 2407.12772 link
2024-07-17 Missing Modality Prediction for Unpaired Multimodal Learning via Joint Embedding of Unimodal Models Donggeun Kim et.al. 2407.12616 null
2024-07-17 E5-V: Universal Embeddings with Multimodal Large Language Models Ting Jiang et.al. 2407.12580 link
2024-07-16 FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models Pengxiang Li et.al. 2407.11522 null
2024-07-16 COMET: "Cone of experience" enhanced large multimodal model for mathematical problem generation Sannyuya Liu et.al. 2407.11315 null
2024-07-15 OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models Zijian Zhou et.al. 2407.11213 link
2024-07-15 FabGPT: An Efficient Large Multimodal Model for Complex Wafer Defect Knowledge Queries Yuqi Jiang et.al. 2407.10810 null
2024-07-15 Scaling 3D Reasoning with LMMs to Large Robot Mission Environments Using Datagraphs W. J. Meijer et.al. 2407.10743 null
2024-07-16 Qwen2 Technical Report An Yang et.al. 2407.10671 link
2024-07-15 How and where does CLIP process negation? Vincent Quantmeyer et.al. 2407.10488 null
2024-07-12 Diagnosing and Re-learning for Balanced Multimodal Learning Yake Wei et.al. 2407.09705 link
2024-07-12 Unifying Sequences, Structures, and Descriptions for Any-to-Any Protein Generation with the Large Multimodal Model HelixProtX Zhiyuan Chen et.al. 2407.09274 link
2024-07-12 DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training Chen Xin et.al. 2407.09174 link
2024-07-11 Emerging Practices for Large Multimodal Model (LMM) Assistance for People with Visual Impairments: Implications for Design Jingyi Xie et.al. 2407.08882 null
2024-07-10 RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization Xijie Huang et.al. 2407.08044 link
2024-07-10 LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models Feng Li et.al. 2407.07895 link
2024-07-11 InstructLayout: Instruction-Driven 2D and 3D Layout Synthesis with Semantic Graph Prior Chenguo Lin et.al. 2407.07580 null
2024-07-10 Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model Wenqi Zhang et.al. 2407.07053 link
2024-07-08 ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation Ethan Chern et.al. 2407.06135 link
2024-07-07 Multimodal Language Models for Domain-Specific Procedural Video Summarization Nafisa Hussain et.al. 2407.05419 null
2024-07-07 Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition Zirun Guo et.al. 2407.05374 link
2024-07-06 Enhance the Robustness of Text-Centric Multimodal Alignments Ting-Yu Yen et.al. 2407.05036 null
2024-07-06 Completed Feature Disentanglement Learning for Multimodal MRIs Analysis Tianling Liu et.al. 2407.04916 null
2024-07-06 MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension Zekun Li et.al. 2407.04903 link
2024-07-05 VCoME: Verbal Video Composition with Multimodal Editing Effects Weibo Gong et.al. 2407.04697 null
2024-07-05 Multimodal Classification via Modal-Aware Interactive Enhancement Qing-Yuan Jiang et.al. 2407.04587 null
2024-07-05 Robust Multimodal Learning via Representation Decoupling Shicai Wei et.al. 2407.04458 null
2024-07-05 Smart Vision-Language Reasoners Denisa Roberts et.al. 2407.04212 link
2024-07-04 Investigating the Role of Instruction Variety and Task Difficulty in Robotic Manipulation Tasks Amit Parekh et.al. 2407.03967 link
2024-07-04 ADAPT: Multimodal Learning for Detecting Physiological Changes under Missing Modalities Julie Mordacq et.al. 2407.03836 link
2024-07-04 M$\mathbf{5}$ -- A Diverse Benchmark to Assess the Performance of Large Multimodal Models Across Multilingual and Multicultural Vision-Language Tasks Florian Schneider et.al. 2407.03791 null
2024-07-03 HEMM: Holistic Evaluation of Multimodal Foundation Models Paul Pu Liang et.al. 2407.03418 link
2024-07-02 Multi-Peptide: Multimodality Leveraged Language-Graph Learning of Peptide Properties Srivathsan Badrinarayanan et.al. 2407.03380 link
2024-07-02 Understanding Alignment in Multimodal LLMs: A Comprehensive Study Elmira Amirloo et.al. 2407.02477 null
2024-07-02 Synthetic Multimodal Question Generation Ian Wu et.al. 2407.02233 null
2024-07-02 Crossroads of Continents: Automated Artifact Extraction for Cultural Adaptation with Large Multimodal Models Anjishnu Mukherjee et.al. 2407.02067 link
2024-07-01 Empathic Grounding: Explorations using Multimodal Interaction and Large Language Models with Conversational Agents Mehdi Arjmand et.al. 2407.01824 link
2024-07-01 We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? Runqi Qiao et.al. 2407.01284 link
2024-07-01 Unaligning Everything: Or Aligning Any Text to Any Image in Multimodal Models Shaeke Salman et.al. 2407.01157 null
2024-06-29 AI-powered multimodal modeling of personalized hemodynamics in aortic stenosis Caglar Ozturk et.al. 2407.00535 null
2024-06-29 MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation Jinsheng Huang et.al. 2407.00468 link
2024-06-29 How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models Jaeyoung Lee et.al. 2407.00369 null
2024-06-28 PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration Yuxuan Sun et.al. 2407.00203 null
2024-06-28 EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model Yuxuan Zhang et.al. 2406.20076 link
2024-06-28 InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding Kirolos Ataallah et.al. 2406.19875 link
2024-06-28 MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis Jun-Yan He et.al. 2406.19859 null
2024-06-28 MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment Jihao Liu et.al. 2406.19736 link
2024-06-28 Enhancing Radiological Diagnosis: A Collaborative Approach Integrating AI and Human Expertise for Visual Miss Correction Akash Awasthi et.al. 2406.19686 null
2024-06-28 SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs Xin Su et.al. 2406.19593 null
2024-06-27 OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding Tao Zhang et.al. 2406.19389 null
2024-06-28 FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts Shubhankar Singh et.al. 2406.19237 null
2024-06-27 RAVEN: Multitask Retrieval Augmented Vision-Language Learning Varun Nagaraj Rao et.al. 2406.19150 null
2024-06-27 DocKylin: A Large Multimodal Model for Visual Document Understanding with Efficient Visual Slimming Jiaxin Zhang et.al. 2406.19101 null
2024-06-27 Fairness and Bias in Multimodal AI: A Survey Tosin Adewumi et.al. 2406.19097 null
2024-06-27 MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation Sanggeon Yun et.al. 2406.18815 null
2024-06-26 MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data William Berman et.al. 2406.18790 null
2024-06-26 S3: A Simple Strong Sample-effective Multimodal Dialog System Elisei Rykov et.al. 2406.18305 link
2024-06-26 EHR-Based Mobile and Web Platform for Chronic Disease Risk Prediction Using Large Language Multimodal Models Chun-Chieh Liao et.al. 2406.18087 null
2024-06-26 Speech2UnifiedExpressions: Synchronous Synthesis of Co-Speech Affective Face and Body Expressions from Affordable Inputs Uttaran Bhattacharya et.al. 2406.18068 null
2024-06-25 Human-centered In-building Embodied Delivery Benchmark Zhuoqun Xu et.al. 2406.17898 link
2024-06-25 InFiConD: Interactive No-code Fine-tuning with Concept-based Knowledge Distillation Jinbin Huang et.al. 2406.17838 null
2024-06-25 Data curation via joint example selection further accelerates multimodal learning Talfan Evans et.al. 2406.17711 null
2024-06-25 Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights Hao Yang et.al. 2406.17430 link
2024-06-24 At First Sight: Zero-Shot Classification of Astronomical Images with Large Multimodal Models Dimitrios Tanoglidis et.al. 2406.17057 null
2024-06-24 Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models Jierun Chen et.al. 2406.16866 link
2024-06-24 Long Context Transfer from Language to Vision Peiyuan Zhang et.al. 2406.16852 link
2024-06-24 QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds Ye Wang et.al. 2406.16578 null
2024-06-21 Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning Brandon Huang et.al. 2406.15334 link
2024-06-21 Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models Jiayu Wang et.al. 2406.14852 link
2024-06-20 Evaluating vision-capable chatbots in interpreting kinematics graphs: a comparative study of free and subscription-based models Giulia Polverini et.al. 2406.14685 null
2024-06-20 Revealing Vision-Language Integration in the Brain with Multimodal Networks Vighnesh Subramaniam et.al. 2406.14481 link
2024-06-25 iWISDM: Assessing instruction following in multimodal models at scale Xiaoxuan Lei et.al. 2406.14343 link
2024-06-20 Two Giraffes in a Dirt Field: Using Game Play to Investigate Situation Modelling in Large Multimodal Models Sherzod Hakimov et.al. 2406.14035 null
2024-06-20 Knowledge-driven Subspace Fusion and Gradient Coordination for Multi-modal Learning Yupei Zhang et.al. 2406.13979 link
2024-06-20 PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents Junjie Wang et.al. 2406.13923 null
2024-06-19 Through the Theory of Mind's Eye: Reading Minds with Multimodal Video Large Language Models Zhawnen Chen et.al. 2406.13763 null
2024-06-19 GUI Action Narrator: Where and When Did That Action Take Place? Qinchen Wu et.al. 2406.13719 null
2024-06-19 Is AI fun? HumorDB: a curated dataset and benchmark to investigate graphical humor Veedant Jain et.al. 2406.13564 null
2024-06-19 VisualRWKV: Exploring Recurrent Neural Networks for Visual Language Models Haowen Hou et.al. 2406.13362 link
2024-06-19 Learnable In-Context Vector for Visual Question Answering Yingzhe Peng et.al. 2406.13185 link
2024-06-18 Synergizing Foundation Models and Federated Learning: A Survey Shenghui Li et.al. 2406.12844 null
2024-06-18 OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI Zhen Huang et.al. 2406.12753 link
2024-06-18 Disturbing Image Detection Using LMM-Elicited Emotion Embeddings Maria Tzelepi et.al. 2406.12668 null
2024-06-18 Automatic benchmarking of large multimodal models via iterative experiment programming Alessandro Conti et.al. 2406.12321 link
2024-06-18 Language and Multimodal Models in Sports: A Survey of Datasets and Applications Haotian Xia et.al. 2406.12252 null
2024-06-17 VideoLLM-online: Online Video Large Language Model for Streaming Video Joya Chen et.al. 2406.11816 null
2024-06-17 LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning Dantong Niu et.al. 2406.11815 null
2024-06-17 Multimodal Learning To Improve Segmentation With Intraoperative CBCT & Preoperative CT Maximilian E. Tschuchnig et.al. 2406.11650 null
2024-06-17 Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment Chao Wen et.al. 2406.11334 null
2024-06-17 VideoVista: A Versatile Benchmark for Video Understanding and Reasoning Yunxin Li et.al. 2406.11303 null
2024-06-17 i-SRT: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective Judgment Daechul Ahn et.al. 2406.11280 link
2024-06-17 MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens Anas Awadalla et.al. 2406.11271 link
2024-06-17 Generative Visual Instruction Tuning Jefferson Hernandez et.al. 2406.11262 link
2024-06-17 Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective Yang Chen et.al. 2406.11249 null
2024-06-16 Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies Hung-Ting Su et.al. 2406.10923 null
2024-06-15 Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model Lu Xu et.al. 2406.10484 link
2024-06-12 MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases Rithesh Murthy et.al. 2406.10290 null
2024-06-14 VideoGUI: A Benchmark for GUI Automation from Instructional Videos Kevin Qinghong Lin et.al. 2406.10227 null
2024-06-14 ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation Chufan Shi et.al. 2406.09961 link
2024-06-14 BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval Imanol Miranda et.al. 2406.09952 link
2024-06-13 VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding Muhammad Maaz et.al. 2406.09418 link
2024-06-13 Explore the Limits of Omni-modal Pretraining at Scale Yiyuan Zhang et.al. 2406.09412 link
2024-06-14 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities Roman Bachmann et.al. 2406.09406 null
2024-06-13 Yo'LLaVA: Your Personalized Language and Vision Assistant Thao Nguyen et.al. 2406.09400 link
2024-06-13 CMC-Bench: Towards a New Paradigm of Visual Signal Compression Chunyi Li et.al. 2406.09356 link
2024-06-13 Comparison Visual Instruction Tuning Wei Lin et.al. 2406.09240 null
2024-06-13 Zoom and Shift are All You Need Jiahao Qin et.al. 2406.08866 null
2024-06-11 Embedding-based Multimodal Learning on Pan-Squamous Cell Carcinomas for Improved Survival Outcomes Asim Waqas et.al. 2406.08521 null
2024-06-14 Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models Yi-Fan Zhang et.al. 2406.08487 link
2024-06-13 OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text Qingyun Li et.al. 2406.08418 link
2024-06-12 A Concept-Based Explainability Framework for Large Multimodal Models Jayneel Parekh et.al. 2406.08074 link
2024-06-12 LVBench: An Extreme Long Video Understanding Benchmark Weihan Wang et.al. 2406.08035 link
2024-06-11 Cognitive Insights Across Languages: Enhancing Multimodal Interview Analysis David Ortiz-Perez et.al. 2406.07542 link
2024-06-11 Understanding Visual Concepts Across Models Brandon Trabucco et.al. 2406.07506 link
2024-06-11 Unified Modeling Enhanced Multimodal Learning for Precision Neuro-Oncology Huahui Yi et.al. 2406.07078 link
2024-06-14 BTS: Bridging Text and Sound Modalities for Metadata-Aided Respiratory Sound Classification June-Woo Kim et.al. 2406.06786 link
2024-06-10 Vript: A Video Is Worth Thousands of Words Dongjie Yang et.al. 2406.06040 link
2024-06-10 FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model Yebin Lee et.al. 2406.06004 link
2024-06-10 CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark David Romero et.al. 2406.05967 null
2024-06-09 Stealthy Targeted Backdoor Attacks against Image Captioning Wenshu Fan et.al. 2406.05874 link
2024-06-09 F-LMM: Grounding Frozen Large Multimodal Models Size Wu et.al. 2406.05821 link
2024-06-08 Generalist Multimodal AI: A Review of Architectures, Challenges and Opportunities Sai Munikoti et.al. 2406.05496 null
2024-06-07 Semantic Segmentation on VSPW Dataset through Masked Video Consistency Chen Liang et.al. 2406.04979 null
2024-06-07 Predictive Dynamic Fusion Bing Cao et.al. 2406.04802 link
2024-06-07 MGIMM: Multi-Granularity Instruction Multimodal Model for Attribute-Guided Remote Sensing Image Detailed Description Cong Yang et.al. 2406.04716 link
2024-06-07 AICoderEval: Improving AI Domain Code Generation of Large Language Models Yinghui Xia et.al. 2406.04712 null
2024-06-06 GenAI Arena: An Open Evaluation Platform for Generative Models Dongfu Jiang et.al. 2406.04485 null
2024-06-06 MAIRA-2: Grounded Radiology Report Generation Shruthi Bannur et.al. 2406.04449 link
2024-06-06 DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs Lingchen Meng et.al. 2406.04334 null
2024-06-06 BLSP-Emo: Towards Empathetic Large Speech-Language Models Chen Wang et.al. 2406.03872 link
2024-06-05 Identification of Stone Deterioration Patterns with Large Multimodal Models Daniele Corradetti et.al. 2406.03207 link
2024-06-05 Exploiting LMM-based knowledge for image classification tasks Maria Tzelepi et.al. 2406.03071 null
2024-06-02 Multimodal Deep Learning for Low-Resource Settings: A Vector Embedding Alignment Approach for Healthcare Applications David Restrepo et.al. 2406.02601 null
2024-06-04 Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning Alex Jinpeng Wang et.al. 2406.02547 link
2024-06-04 Dealing with All-stage Missing Modality: Towards A Universal Model with Robust Reconstruction and Personalization Yunpeng Zhao et.al. 2406.01987 null
2024-06-03 Automatic Fused Multimodal Deep Learning for Plant Identification Alfreds Lapkovskis et.al. 2406.01455 link
2024-06-05 Pulmonary Embolism Mortality Prediction Using Multimodal Learning Based on Computed Tomography Angiography and Clinical Data Zhusi Zhong et.al. 2406.01302 null
2024-06-03 Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model Kezhen Chen et.al. 2406.00977 link
2024-06-02 Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient Zechu Li et.al. 2406.00681 null
2024-06-04 StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond Pengyuan Lyu et.al. 2405.21013 null
2024-05-31 Don't Buy it! Reassessing the Ad Understanding Abilities of Contrastive Multimodal Models A. Bavaresco et.al. 2405.20846 link
2024-06-17 Ovis: Structural Embedding Alignment for Multimodal Large Language Model Shiyin Lu et.al. 2405.20797 link
2024-05-31 Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning Yang Chen et.al. 2405.20606 link
2024-05-30 Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA Qianqi Yan et.al. 2405.20421 link
2024-05-30 Retrieval Augmented Structured Generation: Business Document Information Extraction As Tool Use Franz Louis Cesista et.al. 2405.20245 null
2024-05-31 Visual Attention Analysis in Online Learning Miriam Navarro et.al. 2405.20091 null
2024-05-30 MM-Lego: Modular Biomedical Multimodal Models with Minimal Fine-Tuning Konstantin Hemker et.al. 2405.19950 null
2024-05-30 Instruction-Guided Visual Masking Jinliang Zheng et.al. 2405.19783 link
2024-05-29 Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining Blake R. Duschatko et.al. 2405.19386 null
2024-06-09 LLMs Meet Multimodal Generation and Editing: A Survey Yingqing He et.al. 2405.19334 link
2024-05-29 Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare Hanwei Zhu et.al. 2405.19298 link
2024-05-31 Benchmarking and Improving Detail Image Caption Hongyuan Dong et.al. 2405.19092 link
2024-05-29 Topological Perspectives on Optimal Multimodal Embedding Spaces Abdul Aziz A. B et.al. 2405.18867 null
2024-05-29 Exploring Exotic Decays of the Higgs Boson to Multi-Photons at the LHC via Multimodal Learning Approaches A. Hammad et.al. 2405.18834 null
2024-05-28 The Evolution of Multimodal Model Architectures Shakti N. Wadekar et.al. 2405.17927 null
2024-05-28 Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment Xin Xiao et.al. 2405.17871 link
2024-05-28 Full-Stack Allreduce on Multi-Rail Networks Enda Yu et.al. 2405.17870 null
2024-05-28 MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance Yake Wei et.al. 2405.17730 link
2024-05-27 Matryoshka Multimodal Models Mu Cai et.al. 2405.17430 null
2024-05-27 XFormParser: A Simple and Effective Multimodal Multilingual Semi-structured Form Parser Xianfu Cheng et.al. 2405.17336 link
2024-05-28 LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding Haoyu Zhao et.al. 2405.17104 null
2024-05-27 Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning Zihua Zhao et.al. 2405.16996 link
2024-05-27 Multilingual Diversity Improves Vision-Language Representations Thao Nguyen et.al. 2405.16915 null
2024-05-26 Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs Mustafa Shukor et.al. 2405.16700 link
2024-05-25 How Well Do Deep Learning Models Capture Human Concepts? The Case of the Typicality Effect Siddhartha K. Vemuri et.al. 2405.16128 null
2024-05-24 ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models Chunjiang Ge et.al. 2405.15738 link
2024-05-24 Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models Yongsheng Yu et.al. 2405.15687 null
2024-05-24 M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models Hongyu Wang et.al. 2405.15638 link
2024-05-24 DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception Run Luo et.al. 2405.15232 link
2024-05-24 Shopping Queries Image Dataset (SQID): An Image-Enriched ESCI Dataset for Exploring Multimodal Learning in Product Search Marie Al Ghossein et.al. 2405.15190 link

(back to top)

Generative Weight Space Modeling

Publish Date Title Authors PDF Code
2024-12-19 DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation Wang Zhao et.al. 2412.15200 null
2024-12-18 On the principle of linearized stability for quasilinear evolution equations in time-weighted spaces Bogdan-Vasile Matioc et.al. 2412.13940 null
2024-12-17 On the Bäcklund transform and the stability of the line soliton of the KP-II equation on $\mathbb R^2$ Lorenzo Pompili et.al. 2412.12530 null
2024-12-13 On the embedding of weighted Sobolev spaces with applications to a planar nonlinear Schrödinger equation Antonio Azzolini et.al. 2412.10067 null
2024-12-12 Modified scattering for the cubic dispersion-managed NLS Jason Murphy et.al. 2412.09762 null
2024-12-12 LoRACLR: Contrastive Adaptation for Customization of Diffusion Models Enis Simsar et.al. 2412.09622 null
2024-12-11 Exploring superconformal Yang-Mills theories through matrix Bessel kernels Zoltan Bajnok et.al. 2412.08732 null
2024-12-09 Bilinear singular integral operators with kernels in weighted spaces Petr Honzík et.al. 2412.07014 null
2024-12-04 Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach Lingchen Sun et.al. 2412.03017 link
2024-11-21 Strong localization blurs criticality of time series for spreading phenomena on networks Juliane T. Moraes et.al. 2412.01842 null
2024-12-02 Geometric invariant theory and stretched Kostka quasi-polynomials Marc Besson et.al. 2412.01651 null
2024-11-29 Origin-Destination Demand Prediction: An Urban Radiation and Attraction Perspective Xuan Ma et.al. 2412.00167 null
2024-11-29 Rényi complexity in mean-field disordered systems Nina Javerzat et.al. 2411.19817 null
2024-11-28 An Extensive Evaluation of Factual Consistency in Large Language Models for Data-to-Text Generation Joy Mahapatra et.al. 2411.19203 null
2024-11-27 Task Arithmetic Through The Lens Of One-Shot Federated Learning Zhixu Tao et.al. 2411.18607 null
2024-11-25 Spectral properties of Lévy Fokker--Planck equations Hardy Chan et.al. 2411.16424 null
2024-11-20 Nonlinear orbital stability of stationary shock profiles for the Lax-Wendroff scheme Jean-François Coulombel et.al. 2411.13094 null
2024-11-26 Enhancing generalization in high energy physics using white-box adversarial attacks Franck Rothen et.al. 2411.09296 null
2024-11-11 Minimal nilpotent finite $W$-algebra and cuspidal module category of $\mathfrak{sp}_{2n}$ Genqiang Liu et.al. 2411.06768 null
2024-11-07 Well-Posedness and Regularity of the Heat Equation with Robin Boundary Conditions in the Two-Dimensional Wedge Marco Bravin et.al. 2411.04651 null
2024-11-04 SALSA: Soup-based Alignment Learning for Stronger Adaptation in RLHF Atoosa Chegini et.al. 2411.01798 null
2024-12-06 Modular Duality in Deep Learning Jeremy Bernstein et.al. 2410.21265 null
2024-10-26 MarDini: Masked Autoregressive Diffusion for Video Generation at Scale Haozhe Liu et.al. 2410.20280 null
2024-10-25 Four-parameter Mittag-Leffler functions and their associated coherent states Dušan Popov et.al. 2410.19462 null
2024-10-24 Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation Krzysztof Ociepa et.al. 2410.18565 null
2024-10-21 Two dimensional delta Bose gas in a weighted space Sudheesh Surendranath et.al. 2410.16550 null
2024-10-21 In Search of the Successful Interpolation: On the Role of Sharpness in CLIP Generalization Alireza Abdollahpoorrostam et.al. 2410.16476 link
2024-10-23 Universal approximation results for neural networks with non-polynomial activation function over non-compact domains Ariel Neufeld et.al. 2410.14759 null
2024-10-23 Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching Jie Peng et.al. 2410.14740 null
2024-10-16 Differential Shape Optimization with Image Representation for Photonic Design Zhaocheng Liu et.al. 2410.13074 null
2024-10-15 Scaling Laws for Multilingual Language Models Yifei He et.al. 2410.12883 null
2024-10-16 AutoSimTTF: A Fully Automatic Pipeline for Electric Field Simulation and Treatment Planning of Tumor Treating Fields Minmin Wang et.al. 2410.12196 null
2024-10-15 Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence Shangbin Feng et.al. 2410.11163 null
2024-10-14 Deep Linear Probe Generators for Weight Space Learning Jonathan Kahana et.al. 2410.10811 null
2024-10-14 Generating Model Parameters for Controlling: Parameter Diffusion for Controllable Multi-Task Recommendation Chenglei Shen et.al. 2410.10639 null
2024-10-14 MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer Minghao Zhu et.al. 2410.10589 link
2024-10-15 Regions of Level $\ell$ of Catalan/Semiorder-Type Arrangements Yanru Chen et.al. 2410.10198 null
2024-10-13 A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning Chen-Yu Liu et.al. 2410.09846 null
2024-10-11 Meta-Transfer Learning Empowered Temporal Graph Networks for Cross-City Real Estate Appraisal Weijia Zhang et.al. 2410.08947 null
2024-10-09 Efficient Weight-Space Laplace-Gaussian Filtering and Smoothing for Sequential Deep Learning Joanna Sliwa et.al. 2410.06800 null
2024-10-09 Revisiting Multi-Permutation Equivariance through the Lens of Irreducible Representations Yonatan Sverdlov et.al. 2410.06665 link
2024-10-08 Weighted Embeddings for Low-Dimensional Graph Representation Thomas Bläsius et.al. 2410.06042 null
2024-10-05 Computing ground states of Bose-Einstein condensation by normalized deep neural network Weizhu Bao et.al. 2410.05319 link
2024-10-07 Hyper-Representations: Learning from Populations of Neural Networks Konstantin Schürholt et.al. 2410.05107 link
2024-10-06 Integrable Modules of Map full Toroidal Lie Algebras Pradeep Bisht et.al. 2410.04495 null
2024-10-06 Global well-posedness for the defocusing 3D quadratic NLS in the sharp critical space Jia Shen et.al. 2410.04337 null
2024-10-05 Equivariant Neural Functional Networks for Transformers Viet-Hoang Tran et.al. 2410.04209 null
2024-10-15 Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces for Large Finetuned Models Theo Putterman et.al. 2410.04207 null
2024-10-04 Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks Ann Huang et.al. 2410.03972 null
2024-10-04 Autoregressive Moving-average Attention Mechanism for Time Series Forecasting Jiecheng Lu et.al. 2410.03159 link
2024-10-02 Composing Global Optimizers to Reasoning Tasks via Algebraic Objects in Neural Nets Yuandong Tian et.al. 2410.01779 link
2024-10-01 SynCOM: A tool for simulating coronal outflows Valmir Moraes Filho et.al. 2410.01004 null
2024-10-01 On the prime ideals of higher secant varieties of Veronese embeddings of small degrees Katsuhisa Furukawa et.al. 2410.00652 null
2024-09-30 Old Optimizer, New Norm: An Anthology Jeremy Bernstein et.al. 2409.20325 null
2024-09-27 Effects of Peierls phases in open linear chains Anselmo M. Marques et.al. 2409.18780 null
2024-09-27 Density of states in neural networks: an in-depth exploration of learning in parameter space Margherita Mele et.al. 2409.18683 null
2024-09-26 The time periodic problem for the Navier-Stokes equations in exterior domains in weighted spaces Reinhard Farwig et.al. 2409.17590 null
2024-09-25 Scalable Ensemble Diversification for OOD Generalization and Detection Alexander Rubinstein et.al. 2409.16797 null
2024-10-04 Lessons Learned from a Unifying Empirical Study of Parameter-Efficient Transfer Learning (PETL) in Visual Recognition Zheda Mai et.al. 2409.16434 link
2024-09-24 VascX Models: Model Ensembles for Retinal Vascular Analysis from Color Fundus Images Jose Vargas Quiros et.al. 2409.16016 link
2024-09-23 Efficient Large-Scale Quantum Optimization via Counterdiabatic Ansatz Jie Liu et.al. 2409.15055 null
2024-09-24 Weighted Approximation By Max-Product Generalized Exponential Sampling Series Satyaranjan Pradhan et.al. 2409.14884 null
2024-09-21 Weakly magnetized black holes in Einstein-ModMax theory Haryanto M. Siahaan et.al. 2409.13967 null
2024-09-18 Monomial Matrix Group Equivariant Neural Functional Networks Hoang V. Tran et.al. 2409.11697 link
2024-09-17 Existence of an extremal function of Sobolev critical embedding with an $α$ -homogeneous weight Petr Gurka et.al. 2409.11193 null
2024-09-16 Inferring stellar parameters and their uncertainties from high-resolution spectroscopy using invertible neural networks Nils Candebat et.al. 2409.10621 null
2024-09-13 Non-unitary Wightman CFTs and non-unitary vertex algebras Sebastiano Carpi et.al. 2409.08454 null
2024-09-12 Global well-posedness and scattering in weighted space for nonlinear Schrödinger equations below the Strauss exponent without gauge-invariance Masaki Kawamoto et.al. 2409.08432 null
2024-09-09 Fast gradient-free optimization of excitations in variational quantum eigensolvers Jonas Jäger et.al. 2409.05939 null
2024-09-06 SCARF: Scalable Continual Learning Framework for Memory-efficient Multiple Neural Radiance Fields Yuze Wang et.al. 2409.04482 null
2024-09-04 Federated Quantum-Train with Batched Parameter Generation Chen-Yu Liu et.al. 2409.02763 null
2024-09-16 Regret Analysis for Randomized Gaussian Process Upper Confidence Bound Shion Takeno et.al. 2409.00979 null
2024-08-30 Abstracted Gaussian Prototypes for One-Shot Concept Learning Chelsea Zou et.al. 2408.17251 link
2024-08-23 Emergence of global receptive fields capturing multipartite quantum correlations Oleg M. Sotnikov et.al. 2408.13033 null
2024-08-22 Action of $\mathfrak{osp}(1|2n)$ on polynomials tensor $\mathbb{C}^{0|2n}$ Dwight Anderson Williams II et.al.
2024-08-19 Unimodal sequences and mixed false theta functions Kevin Allen et.al. 2408.09789 null
2024-08-16 Onsager-Machlup functional for stochastic lattice dynamical systems driven by time-varying noise Xinze Zhang et.al. 2408.08465 null
2024-08-10 Variational Inference Failures Under Model Symmetries: Permutation Invariant Posteriors for Bayesian Neural Networks Yoav Gelberg et.al. 2408.05496 null
2024-08-09 Quasilinear parabolic equations with superlinear nonlinearities in critical spaces Bogdan-Vasile Matioc et.al. 2408.05067 null
2024-08-08 A framework for generalizing toric inequalities for holographic entanglement entropy Ning Bao et.al. 2408.04741 null
2024-08-07 Counterfactuals and Uncertainty-Based Explainable Paradigm for the Automated Detection and Segmentation of Renal Cysts in Computed Tomography Images: A Multi-Center Study Zohaib Salahuddin et.al. 2408.03789 null
2024-08-05 BOTS-LM: Training Large Language Models for Setswana Nathan Brown et.al. 2408.02239 null
2024-08-02 Conditional LoRA Parameter Generation Xiaolong Jin et.al. 2408.01415 null
2024-08-01 Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization Róisín Luo et.al. 2408.00923 null
2024-07-31 Semantic Codebook Learning for Dynamic Recommendation Models Zheqi Lv et.al. 2408.00123 null
2024-07-29 Tensor product weight modules over the affine-Virasoro algebra Qiu-Fan Chen et.al. 2407.19844 null
2024-07-24 Generalized Hilbert operators acting on weighted spaces of holomorphic functions with sup-norms María J. Beltrán-Meneu et.al. 2407.17646 null
2024-07-24 Generalized Ordinal Priority Approach for Multi-Attribute Decision-Making under Incomplete Preference Information Renlong Wang et.al. 2407.17099 null
2024-07-22 WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation Zirui Shao et.al. 2407.15502 link
2024-07-18 FSP-Laplace: Function-Space Priors for the Laplace Approximation in Bayesian Deep Learning Tristan Cinquin et.al. 2407.13711 null
2024-07-19 Parameter Generation of Quantum Approximate Optimization Algorithm with Diffusion Model Fanxu Meng et.al. 2407.12242 null
2024-07-24 Effect Heterogeneity with Earth Observation in Randomized Controlled Trials: Exploring the Role of Data, Model, and Evaluation Metric Choice Connor T. Jerzak et.al. 2407.11674 link
2024-07-15 Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion Yongyuan Liang et.al. 2407.10973 null
2024-07-16 The well-posedness of generalized nonlinear wave equations on the lattice graph Bobo Hua et.al. 2407.09815 null
2024-07-15 Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Cross-Regularization Jinlong Li et.al. 2407.08374 null
2024-07-09 Fine-Tuning Linear Layers Only Is a Simple yet Effective Way for Task Arithmetic Ruochen Jin et.al. 2407.07089 link
2024-07-04 Recovering Initial States in Semilinear Parabolic Problems from Time-Averages Lina Sophie Schmitz et.al. 2407.03829 null
2024-07-01 A quantum deformation of the ${\mathcal N}=2$ superconformal algebra H. Awata et.al. 2407.00901 null
2024-06-24 WARP: On the Benefits of Weight Averaged Rewarded Policies Alexandre Ramé et.al. 2406.16768 null
2024-06-24 Improving robustness to corruptions with multiplicative weight perturbations Trung Trinh et.al. 2406.16540 link
2024-06-21 Determination of certain mod $p$ Galois representations using local constancy Abhik Ganguli et.al. 2406.15600 null
2024-06-21 Elliptic analysis on collapsing gravitational instantons modelled using the Gibbons-Hawking ansatz Willem Adriaan Salm et.al. 2406.15008 null
2024-06-20 MEAT: Median-Ensemble Adversarial Training for Improving Robustness and Generalization Zhaozhe Hu et.al. 2406.14259 link
2024-06-18 From Instance Training to Instruction Learning: Task Adapters Generation from Instructions Huanxuan Liao et.al. 2406.12382 link
2024-06-17 Kaniadakis entropy in extreme gravitational and cosmological environments: a review on the state-of-the-art and future prospects Giuseppe Gaetano Luciano et.al. 2406.11373 null
2024-06-16 Analysis and approximation of elliptic problems with Uhlenbeck structure in convex polytopes Tadele Mengesha et.al. 2406.10762 null
2024-06-14 Towards Scalable and Versatile Weight Space Learning Konstantin Schürholt et.al. 2406.09997 link
2024-06-13 Interpreting the Weight Space of Customized Diffusion Models Amil Dravid et.al. 2406.09413 link
2024-06-12 Diffusion Soup: Model Merging for Text-to-Image Diffusion Models Benjamin Biggs et.al. 2406.08431 null
2024-06-24 Cartan monopoles Andrei Smilga et.al. 2406.06042 null
2024-06-08 Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models Minho Park et.al. 2406.05432 link
2024-06-06 Regularized KL-Divergence for Well-Defined Function-Space Variational Inference in Bayesian neural networks Tristan Cinquin et.al. 2406.04317 null
2024-06-06 A characterization of $(μ,ν)$ -dichotomies via admissibility Lucas Backes et.al. 2406.04126 null
2024-06-05 Reproducing Kernel Thesis of Hankel Operators on Weighted Hardy Spaces Ana Čolović et.al. 2406.03106 null
2024-05-21 Backpropogation-Free Multi-modal On-Device Model Adaptation via Cloud-Device Collaboration Wei Ji et.al. 2406.01601 null
2024-05-29 Thermodynamics of the most generalized form of Holographic Dark Energy and some particular cases with Corrected Entropies Sanghati Saha et.al. 2405.20783 null
2024-06-20 The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof Derek Lim et.al. 2405.20231 link
2024-05-28 Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography Jie Liu et.al. 2405.18356 link
2024-05-28 $C^2M^3$ : Cycle-Consistent Multi-Model Merging Donato Crisostomi et.al. 2405.17897 link
2024-05-27 Smoothing effects and extinction in finite time for fractional fast diffusions on Riemannian manifolds Elvise Berchio et.al. 2405.17126 null
2024-05-31 FedSheafHN: Personalized Federated Learning on Graph-structured Data Wenfei Liang et.al. 2405.16056 null
2024-05-27 HyperInterval: Hypernetwork approach to training weight interval regions in continual learning Patryk Krukowski et.al. 2405.15444 link
2024-05-23 Scalable Optimization in the Modular Norm Tim Large et.al. 2405.14813 link
2024-06-16 A refined Weyl character formula for comodules on $\operatorname{GL}_{2,A}$ Helge Øystein Maakestad et.al. 2405.09210 null
2024-05-13 Localizing Task Information for Improved Model Merging and Compression Ke Wang et.al. 2405.07813 link
2024-05-13 $α$ VIL: Learning to Leverage Auxiliary Tasks for Multitask Learning Rafael Kourdis et.al. 2405.07769 null
2024-05-12 Approximation by a new sequence of operators involving Laguerre polynomials Kapil Kumar et.al. 2405.07228 null
2024-05-06 Swarm intelligence for full Stokes dynamic imaging reconstruction of interferometric data Alejandro Mus et.al. 2405.03330 null
2024-05-04 Large Deviation Principles of Invariant Measures of Stochastic Reaction-Diffusion Lattice Systems Bixiang Wang et.al. 2405.02720 null
2024-05-03 The Immersed Inextensible Interface Problem in 2D Stokes Flow Eduardo García-Juárez et.al. 2405.02446 null
2024-05-02 Customizing Text-to-Image Models with a Single Image Pair Maxwell Jones et.al. 2405.01536 null
2024-04-25 Robust Fine-tuning for Pre-trained 3D Point Cloud Models Zhibo Zhang et.al. 2404.16422 null
2024-04-23 The Geometry of the Set of Equivalent Linear Neural Networks Jonathan Richard Shewchuk et.al. 2404.14855 null
2024-04-24 Nonexistence of solutions to parabolic problems with a potential on weighted graphs Dario D. Monticelli et.al. 2404.12058 null
2024-04-17 On the relaxation to equilibrium of a quantum oscillator interacting with a radiation field Pierre-A. Vuillermot et.al. 2404.11329 null
2024-04-15 Higher-curvature gravity in AdS $_3$, holographic $c$ -theorems and black hole microstates Mariano Chernicoff et.al. 2404.10128 null
2024-04-16 Asymptotic-preserving approximations for stochastic incompressible viscous fluids and SPDEs on graph Jianbo Cui et.al. 2404.09168 null
2024-04-09 Perspective on Physical Interpretations of Rényi Entropy in Statistical Mechanics Misaki Ozawa et.al. 2404.06436 null
2024-04-09 A gluing construction of singular solutions for a fully non-linear equation in conformal geometry María Fernanda Espinal et.al. 2404.05965 null
2024-04-05 Dissipative Euler flows originating from circular vortex filaments Francisco Gancedo et.al. 2404.04250 null
2024-04-05 Macdonald characters from a new formula for Macdonald polynomials Houcine Ben Dali et.al. 2404.03904 null
2024-04-04 Fundamental inequalities for the iterated Fourier-cosine convolution with Gaussian weight and its application Nguyen Thi Hong Phuong et.al. 2404.03609 null
2024-03-29 Embracing Unknown Step by Step: Towards Reliable Sparse Training in Real World Bowen Lei et.al. 2403.20047 link
2024-03-28 Model Stock: All we need is just a few fine-tuned models Dong-Hwan Jang et.al. 2403.19522 link
2024-03-26 A location Invariant Statistic-Based Consistent Estimation Method for Three-Parameter Generalized Exponential Distribution Kiran Prajapat et.al. 2403.17609 null
2024-06-03 FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis Santosh Sanjeev et.al. 2403.13341 link
2024-06-18 Learning Useful Representations of Recurrent Neural Network Weight Matrices Vincent Herrmann et.al. 2403.11998 link
2024-03-16 Function-space Parameterization of Neural Networks for Sequential Learning Aidan Scannell et.al. 2403.10929 link
2024-03-14 Imprints of Barrow-Tsallis Cosmology in Primordial Gravitational Waves Petr Jizba et.al. 2403.09797 null
2024-03-14 Eigenvariety for partially classical Hilbert modular forms Mladen Dimitrov et.al. 2403.09784 null
2024-03-12 The solenoidal Heisenberg Virasoro algebra and its simple weight modules Boujemaa Agrebaoui et.al. 2403.07381 null
2024-03-10 FrameQuant: Flexible Low-Bit Quantization for Transformers Harshavardhan Adepu et.al. 2403.06082 link
2024-03-06 The solenoidal Virasoro algebra and its simple weight modules Boujemaa Agrebaoui et.al. 2403.03753 null
2024-03-05 Tensor Decomposition-based Time Varying Channel Estimation for mmWave MIMO-OFDM Systems Ruizhe Wang et.al. 2403.02942 null
2024-03-05 Neural Redshift: Random Networks are not Random Functions Damien Teney et.al. 2403.02241 null
2024-03-04 Tiny fluctuations of the averaging process around its degenerate steady state Federico Sau et.al. 2403.02032 null
2024-03-15 Training-Free Pretrained Model Merging Zhengqi Xu et.al. 2403.01753 link
2024-04-22 HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances Supreeth Narasimhaswamy et.al. 2403.01693 null
2024-03-13 TOOLVERIFIER: Generalization to New Tools via Self-Verification Dheeraj Mekala et.al. 2402.14158 link
2024-02-21 Computing Tangent Spaces to Eigenvarieties James Rawson et.al. 2402.13799 null
2024-05-28 Neural Network Parameter Diffusion Kai Wang et.al. 2402.13144 link
2024-02-19 Exponential attractors for a nonlocal delayed reaction-diffusion equation on an unbounded domain Wenjie Hu et.al. 2402.11856 null
2024-02-18 Discrete Neural Algorithmic Reasoning Gleb Rodionov et.al. 2402.11628 link
2024-02-17 Uncertainty Quantification of Graph Convolution Neural Network Models of Evolving Processes Jeremiah Hauth et.al. 2402.11179 null
2024-06-06 Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning Tuc Nguyen et.al. 2402.10639 null
2024-02-14 TAI-GAN: A Temporally and Anatomically Informed Generative Adversarial Network for early-to-late frame conversion in dynamic cardiac PET inter-frame motion correction Xueqi Guo et.al. 2402.09567 null
2024-02-14 The cohomology of $p$ -adic Deligne-Lusztig schemes of Coxeter type Alexander B. Ivanov et.al. 2402.09017 null
2024-02-09 The Asymptotic Structure of Cosmological Integrals Paolo Benincasa et.al. 2402.06558 null
2024-02-07 Universal Neural Functionals Allan Zhou et.al. 2402.05232 link
2024-02-06 Maximal regularity and optimal control for a non-local Cahn-Hilliard tumour growth model Matteo Fornoni et.al. 2402.04204 null
2024-02-06 Improved Generalization of Weight Space Networks via Augmentations Aviv Shamsian et.al. 2402.04081 link
2024-02-02 Training-time Neuron Alignment through Permutation Subspace for Improving Linear Mode Connectivity and Model Fusion Zexi Li et.al. 2402.01342 null
2024-02-01 Understanding Neural Network Systems for Image Analysis using Vector Spaces and Inverse Maps Rebecca Pattichis et.al. 2402.00261 link
2024-01-26 Do deep neural networks utilize the weight space efficiently? Onur Can Koyun et.al. 2401.16438 null
2024-01-22 On strong growth conditions for weighted spaces of entire functions Gerhard Schindl et.al. 2401.14330 null
2024-01-24 Task structure and nonlinearity jointly determine learned representational geometry Matteo Alleman et.al. 2401.13558 null
2024-01-25 Sparse Domination of Singular Bilinear Forms on Non-Homogeneous spaces Paco Villarroya et.al. 2401.13130 null
2024-01-22 WARM: On the Benefits of Weight Averaged Reward Models Alexandre Ramé et.al. 2401.12187 null
2024-01-17 Cesàro operators associated with Borel measures acting on weighted spaces of holomorphic functions with sup-norm Maria José Beltrán Meneu et.al. 2401.09406 null
2024-01-15 Singular fractal dimension at periodicity cascades in parameters spaces Carlos E. P. Abreu et.al. 2401.07648 null
2024-01-17 Computing Fringe Presentations of Multigraded Persistence Modules Fabian Lenzen et.al. 2401.06008 null
2024-01-10 Grimoire is All You Need for Enhancing Large Language Models Ding Chen et.al. 2401.03385 link
2024-03-26 Artificial Intelligence for Operations Research: Revolutionizing the Operations Research Process Zhenan Fan et.al. 2401.03244 null
2023-12-31 A Compact Representation for Bayesian Neural Networks By Removing Permutation Symmetry Tim Z. Xiao et.al. 2401.00611 link
2023-12-28 Fractional non-homogeneous counting process Nick Laskin et.al. 2312.17389 null
2023-12-28 Some unimodal sequences of Kronecker coefficients Alimzhan Amanov et.al. 2312.17054 null
2023-12-24 The Vlasov-Maxwell-Boltzmann/Landau system with polynomial perturbation near Maxwellian Chuqi Cao et.al. 2312.15510 null
2023-12-22 Emage: Non-Autoregressive Text-to-Image Generation Zhangyin Feng et.al. 2312.14988 null
2023-12-21 Hypercyclic shifts on lattice graphs Anton Baranov et.al. 2312.13934 null
2023-12-21 Scattering for 2d semi-relativistic Hartree equations with short range potential Changhun Yang et.al. 2312.13606 null
2023-12-21 Entropic Inflation in Presence of Scalar Field Sergei D. Odintsov et.al. 2312.13587 null
2023-12-30 Time is Encoded in the Weights of Finetuned Language Models Kai Nylund et.al. 2312.13401 link
2023-12-14 Efficient momentum space approach to superconductivity in quasiperiodic systems Mao Yoshii et.al. 2312.09124 null
2023-12-13 Best one-sided algebraic approximation by average modulus Raheam A. Al-Saphory et.al. 2312.08407 null
2023-12-19 Well-Posedness of Quasilinear Parabolic Equations in Time-Weighted Spaces Bogdan Matioc et.al. 2312.07974 null
2023-12-12 Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models Arnav Chavan et.al. 2312.07046 link
2023-12-11 Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks MohammadReza Davari et.al. 2312.06795 null
2023-12-08 Stoichiometry preservation and generalization of Bilger mixture fraction for non-premixed combustion with differential molecular diffusion Haifeng Wang et.al. 2312.05204 null
2023-12-01 New polyconvolution product for Fourier-cosine and Laplace integral operators and their applications Trinh Tuan et.al. 2312.00764 null
2023-11-30 Modelling Einstein cluster using Einasto profile Ritwik Acharyya et.al. 2311.18622 null
2023-11-27 Extraction of the microscopic properties of quasi-particles using deep neural networks Olga Soloveva et.al. 2311.15984 null
2024-01-24 Deep Latent Force Models: ODE-based Process Convolutions for Bayesian Deep Learning Thomas Baldwin-McDonald et.al. 2311.14828 null

(back to top)

Data Distillation

Publish Date Title Authors PDF Code
2024-10-25 FLiP: Privacy-Preserving Federated Learning based on the Principle of Least Privilege ShiMao Xu et.al. 2410.19548 null
2024-10-25 SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models Jahyun Koo et.al. 2410.19503 null
2024-10-24 AlignCap: Aligning Speech Emotion Captioning to Human Preferences Ziqi Liang et.al. 2410.19134 null
2024-10-24 High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws M. Emrullah Ildiz et.al. 2410.18837 null
2024-10-24 Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data Anup Shirgaonkar et.al. 2410.18588 null
2024-10-24 SIKeD: Self-guided Iterative Knowledge Distillation for mathematical reasoning Shivam Adarsh et.al. 2410.18574 link
2024-10-23 ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams Srija Anand et.al. 2410.17901 null
2024-10-23 Towards Active Participant-Centric Vertical Federated Learning: Some Representations May Be All You Need Jon Irureta et.al. 2410.17648 null
2024-10-23 Towards Effective Data-Free Knowledge Distillation via Diverse Diffusion Augmentation Muquan Li et.al. 2410.17606 link
2024-10-23 Physics-driven AI for Channel Estimation in Cellular Network Xiaoqian Qi et.al. 2410.17525 null
2024-10-22 MiniPLM: Knowledge Distillation for Pre-Training Language Models Yuxian Gu et.al. 2410.17215 link
2024-10-22 Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios Kai Wang et.al. 2410.17193 link
2024-10-22 CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare Nicholas I-Hsien Kuo et.al. 2410.16872 null
2024-10-22 AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models Yongjian Wu et.al. 2410.16820 link
2024-10-22 SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation Jing-Jing Li et.al. 2410.16665 null
2024-10-21 Pre-training Distillation for Large Language Models: A Design Space Exploration Hao Peng et.al. 2410.16215 null
2024-10-18 Interpreting Microbiome Relative Abundance Data Using Symbolic Regression Swagatam Haldar et.al. 2410.16109 link
2024-10-21 Are Large-scale Soft Labels Necessary for Large-scale Dataset Distillation? Lingao Xiao et.al. 2410.15919 link
2024-10-21 Model Mimic Attack: Knowledge Distillation for Provably Transferable Adversarial Examples Kirill Lukyanov et.al. 2410.15889 null
2024-10-20 Hybrid Memory Replay: Blending Real and Distilled Data for Class Incremental Learning Jiangtao Kong et.al. 2410.15372 null
2024-10-20 GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning Haiwen Diao et.al. 2410.15266 link
2024-10-19 LLaVA-Ultra: Large Chinese Language and Vision Assistant for Ultrasound Xuechen Guo et.al. 2410.15074 null
2024-10-19 Improving Pronunciation and Accent Conversion through Knowledge Distillation And Synthetic Ground-Truth from Native TTS Tuan Nam Nguyen et.al. 2410.14997 null
2024-10-17 CAKD: A Correlation-Aware Knowledge Distillation Framework Based on Decoupling Kullback-Leibler Divergence Zao Zhang et.al. 2410.14741 null
2024-10-18 Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation Shuai Zhao et.al. 2410.14425 link
2024-10-18 Preview-based Category Contrastive Learning for Knowledge Distillation Muhe Ding et.al. 2410.14143 null
2024-10-17 Leveraging Fine-Tuned Language Models for Efficient and Accurate Smart Contract Auditing Zhiyuan Wei et.al. 2410.13918 link
2024-10-17 GDeR: Safeguarding Efficiency, Balancing, and Robustness via Prototypical Graph Pruning Guibin Zhang et.al. 2410.13761 link
2024-10-17 An Active Learning Framework for Inclusive Generation by Large Language Models Sabit Hassan et.al. 2410.13641 null
2024-10-18 Towards Satellite Non-IID Imagery: A Spectral Clustering-Assisted Federated Learning Approach Luyao Zou et.al. 2410.13602 null
2024-10-17 Enhancing Dataset Distillation via Label Inconsistency Elimination and Learning Pattern Refinement Chuhao Zhou et.al. 2410.13311 link
2024-10-18 Cyber Attacks Prevention Towards Prosumer-based EV Charging Stations: An Edge-assisted Federated Prototype Knowledge Distillation Approach Luyao Zou et.al. 2410.13260 null
2024-10-16 TAS: Distilling Arbitrary Teacher and Student via a Hybrid Assistant Guopeng Li et.al. 2410.12342 null
2024-10-16 Optimizing YOLOv5s Object Detection through Knowledge Distillation algorithm Guanming Huang et.al. 2410.12259 null
2024-10-16 TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration Yiwei Guo et.al. 2410.12183 link
2024-10-17 SAM-Guided Masked Token Prediction for 3D Scene Understanding Zhimin Chen et.al. 2410.12158 null
2024-10-15 MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router Yanyue Xie et.al. 2410.12013 null
2024-10-15 Breaking Modality Gap in RGBT Tracking: Coupled Knowledge Distillation Andong Lu et.al. 2410.11586 link
2024-10-15 Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQL Qihuang Zhong et.al. 2410.11371 null
2024-10-15 Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling Wenda Xu et.al. 2410.11325 null
2024-10-14 BrainMVP: Multi-modal Vision Pre-training for Brain Image Analysis using Multi-parametric MRI Shaohao Rui et.al. 2410.10604 null
2024-10-14 ROSAR: An Adversarial Re-Training Framework for Robust Side-Scan Sonar Object Detection Martin Aubard et.al. 2410.10554 link
2024-10-14 Temperature-Centric Investigation of Speculative Decoding with Knowledge Distillation Siru Ouyang et.al. 2410.10141 null
2024-10-14 REHRSeg: Unleashing the Power of Self-Supervised Super-Resolution for Resource-Efficient 3D MRI Segmentation Zhiyun Song et.al. 2410.10097 null
2024-10-15 Self-Data Distillation for Recovering Quality in Pruned Large Language Models Vithursan Thangarasa et.al. 2410.09982 null
2024-10-13 Generalized Group Data Attribution Dan Ley et.al. 2410.09940 null
2024-10-12 Distilling Invariant Representations with Dual Augmentation Nikolaos Giakoumoglou et.al. 2410.09474 null
2024-10-12 Declarative Knowledge Distillation from Large Language Models for Visual Question Answering Datasets Thomas Eiter et.al. 2410.09428 link
2024-10-15 Transforming In-Vehicle Network Intrusion Detection: VAE-based Knowledge Distillation Meets Explainable AI Muhammet Anil Yagiz et.al. 2410.09043 null
2024-10-11 Mentor-KD: Making Small Language Models Better Multi-step Reasoners Hojae Lee et.al. 2410.09037 link
2024-10-11 Contrastive Knowledge Distillation for Robust Multimodal Sentiment Analysis Zhongyi Sang et.al. 2410.08692 null
2024-10-11 DistDD: Distributed Data Distillation Aggregation through Gradient Matching Peiran Wang et.al. 2410.08665 null
2024-10-11 GAI-Enabled Explainable Personalized Federated Semi-Supervised Learning Yubo Peng et.al. 2410.08634 null
2024-10-11 Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both Abhijnan Nath et.al. 2410.08458 null
2024-10-10 What is Left After Distillation? How Knowledge Transfer Impacts Fairness and Bias Aida Mohammadshahi et.al. 2410.08407 null
2024-10-10 A Lightweight Target-Driven Network of Stereo Matching for Inland Waterways Jing Su et.al. 2410.07915 null
2024-10-10 SNN-PAR: Energy Efficient Pedestrian Attribute Recognition via Spiking Neural Networks Haiyang Wang et.al. 2410.07857 link
2024-10-12 Relational Diffusion Distillation for Efficient Image Generation Weilun Feng et.al. 2410.07679 link
2024-10-10 Teddy: Efficient Large-Scale Dataset Distillation via Taylor-Approximated Matching Ruonan Yu et.al. 2410.07579
