Notes and slides for papers presented during our paper reading sessions.
Papers:
- Efficiently Modeling Long Sequences with Structured State Spaces [paper] [slides]
- Simplified State Space Layers for Sequence Modeling [paper] [slides]
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces [paper] [slides]
- Representation Learning on Hyper-Relational and Numeric Knowledge Graphs with Transformers [paper] [slides]
- Chemical-Reaction-Aware Molecule Representation Learning [paper] [slides]
- Towards Foundation Models for Knowledge Graph Reasoning [paper]
- Pre-training Molecular Graph Representation with 3D Geometry [paper] [slides]
- AdaProp: Learning Adaptive Propagation for Graph Neural Network based Knowledge Graph Reasoning [paper] [slides]
- Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing [paper] [slides]
- Translation between Molecules and Natural Language [paper] [slides]
- SpecFormer: Spectral Graph Neural Networks Meet Transformers [paper] [slides]
- FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning [paper] [slides]
- Context-Enriched Molecule Representations Improve Few-Shot Drug Discovery [paper]
- Cooperative Graph Neural Networks [paper]
- Bi-Level Contrastive Learning for Knowledge-Enhanced Molecule Representations [paper] [slides]
- Equivariant Subgraph Aggregation Networks [paper] [slides]
- Prodigy: Enabling In-context Learning Over Graphs [paper] [slides]
- Learning Rule-Induced Subgraph Representations for Inductive Relation Prediction [paper] [slides]
- Knowledge graph-enhanced molecular contrastive learning with functional prompt [paper] [slides]
- Multi-Grained Multimodal Interaction Network for Entity Linking [paper] [slides]
- Enhancing Molecular Property Prediction with Auxiliary Learning and Task-Specific Adaptation [paper] [slides]
- PolyGCL: Graph Contrastive Learning via Learnable Spectral Polynomial Filters [paper] [slides]
- Less is More: One-Shot-Subgraph Link Prediction on Large-Scale Knowledge Graph [paper] [slides]
- Vision Transformers Need Registers [paper]
- Denoising Diffusion Probabilistic Models [paper]
- Mitigating Memorisation in Diffusion Models [paper]
- Interpreting CLIP’s Image Representation via Text-Based Decomposition [paper]
- Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion [paper]
- Do Transformers Really Perform Bad for Graph Representation? (Graphormer) [paper] [slides]
- Rank-N-Contrast [paper] [slides]
- EGRU [paper]
- Deep Bidirectional Language-Knowledge Graph Pretraining (DRAGON) [paper] [slides]
- Resurrecting Recurrent Neural Networks for Long Sequences (LRU) [paper]
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models [paper]
- Linguistic Binding in Diffusion Models [paper]
- Evolutionary Optimization of Model Merging Recipes [paper] [slides]
- Editing Models with Task Arithmetic [paper] [slides]
- WARM: On the Benefits of Weight Averaged Reward Models [paper] [slides]
- Rewarded Soups: Towards Pareto-Optimal Alignment by Interpolating Weights Fine-Tuned on Diverse Rewards [paper] [slides]
- Arcee’s MergeKit: A Toolkit for Merging Large Language Models [paper] [slides]
- TIES-Merging: Resolving Interference When Merging Models [paper] [slides]
- LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition [paper] [slides]
- ZipIt! Merging Models from Different Tasks without Training [paper] [slides]
- Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization [paper] [slides]
- Text2Mol: Cross-Modal Molecule Retrieval with Natural Language Queries [paper] [slides]
- MOPO: Model-based Offline Policy Optimization [paper] [notes]
- DETR: End-to-End Object Detection with Transformers [paper] [notes]
- Efficient Adaptation for End-to-End Vision-Based Robotic Manipulation [paper] [notes]
- PipeDream: Generalized Pipeline Parallelism for DNN Training [paper] [notes]
- Lottery Ticket Hypothesis [paper] [notes]
- Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies [paper] [notes]
- Designing and Interpreting Probes with Control Tasks [paper] [notes]
- What Does BERT Learn about the Structure of Language? [paper] [notes]
- GNN [notes]
- Universal Adversarial Triggers [paper] [notes]
- Confidence-Aware Learning for Deep Neural Networks (CRL) [paper] [notes]
- A How-to-Model Guide for Neuroscience [paper] [notes]
- Neural ODEs [paper] [notes]
- Model-Based Reinforcement Learning [notes]
- Learning to Describe Scenes with Programs [paper] [notes]
- DeepSynth: Automata Synthesis for Automatic Task Segmentation in RL [paper] [notes]
- Model-Free Conventions in MARL with Heterogeneous Preferences [paper] [notes]
- Accelerating Reinforcement Learning with Learned Skill Priors [paper] [notes]
- Progressive Domain Adaptation for Object Detection [paper] [notes]
- Convolutional Networks with Adaptive Inference Graphs [paper] [notes]