paper-reading-group

Notes for papers presented during our paper reading sessions.

Papers:

  1. Efficiently Modeling Long Sequences with Structured State Spaces [paper] [slides]
  2. Simplified State Space Layers for Sequence Modeling [paper] [slides]
  3. Mamba: Linear-Time Sequence Modeling with Selective State Spaces [paper] [slides]
  4. Representation Learning on Hyper-Relational and Numeric Knowledge Graphs with Transformers [paper] [slides]
  5. Chemical-Reaction-Aware Molecule Representation Learning [paper] [slides]
  6. Towards Foundation Models for Knowledge Graph Reasoning [paper]
  7. Pre-training Molecular Graph Representation with 3D Geometry [paper] [slides]
  8. AdaProp: Learning Adaptive Propagation for Graph Neural Network based Knowledge Graph Reasoning [paper] [slides]
  9. Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing [paper] [slides]
  10. Translation between Molecules and Natural Language [paper] [slides]
  11. SpecFormer: Spectral Graph Neural Networks Meet Transformers [paper] [slides]
  12. FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning [paper] [slides]
  13. Context-Enriched Molecule Representations Improve Few-Shot Drug Discovery [paper]
  14. Cooperative Graph Neural Networks [paper]
  15. Bi-Level Contrastive Learning for Knowledge Enhanced Molecule Representations [paper] [slides]
  16. Equivariant Subgraph Aggregation Networks [paper] [slides]
  17. Prodigy: Enabling In-context Learning Over Graphs [paper] [slides]
  18. Learning Rule-Induced Subgraph Representations for Inductive Relation Prediction [paper] [slides]
  19. Knowledge graph-enhanced molecular contrastive learning with functional prompt [paper] [slides]
  20. Multi-Grained Multimodal Interaction Network for Entity Linking [paper] [slides]
  21. Enhancing Molecular Property Prediction with Auxiliary Learning and Task-Specific Adaptation [paper] [slides]
  22. PolyGCL: Graph Contrastive Learning via Learnable Spectral Polynomial Filters [paper] [slides]
  23. Less is More: One-Shot-Subgraph Link Prediction on Large-Scale Knowledge Graph [paper] [slides]
  24. Vision Transformers Need Registers [paper]
  25. Denoising Diffusion Probabilistic Models [paper]
  26. Mitigating Memorisation in Diffusion Models [paper]
  27. Interpreting CLIP’s Image Representation via Text-Based Decomposition [paper]
  28. Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion [paper]
  29. Do Transformers Really Perform Bad for Graph Representation? (Graphormer) [paper] [slides]
  30. Rank-N-Contrast [paper] [slides]
  31. EGRU [paper]
  32. Deep Bidirectional Language-Knowledge Graph Pretraining (DRAGON) [paper] [slides]
  33. Resurrecting Recurrent Neural Networks for Long Sequences (LRU) [paper]
  34. Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models [paper]
  35. Linguistic Binding in Diffusion Models [paper]
  36. Evolutionary Optimization of Model Merging Recipes [paper] [slides]
  37. Editing Models with Task Arithmetic [paper] [slides]
  38. WARM: On the Benefits of Weight Averaged Reward Models [paper] [slides]
  39. Rewarded Soups: Towards Pareto-Optimal Alignment by Interpolating Weights Fine-Tuned on Diverse Rewards [paper] [slides]
  40. Arcee’s MergeKit: A Toolkit for Merging Large Language Models [paper] [slides]
  41. TIES-Merging: Resolving Interference When Merging Models [paper] [slides]
  42. LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition [paper] [slides]
  43. ZipIt! Merging Models from Different Tasks without Training [paper] [slides]
  44. Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization [paper] [slides]
  45. Text2Mol: Cross-Modal Molecule Retrieval with Natural Language Queries [paper] [slides]
  46. MOPO: Model-based Offline Policy Optimization [paper] [notes]
  47. DETR: End-to-End Object Detection with Transformers [paper] [notes]
  48. Efficient Adaptation for End-to-End Vision-Based Robotic Manipulation [paper] [notes]
  49. PipeDream: Generalized Pipeline Parallelism for DNN Training [paper] [notes]
  50. Lottery Ticket Hypothesis [paper] [notes]
  51. Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies [paper] [notes]
  52. Designing and Interpreting Probes with Control Tasks [paper] [notes]
  53. What Does BERT Learn about the Structure of Language? [paper] [notes]
  54. GNN [notes]
  55. Universal Adversarial Triggers [paper] [notes]
  56. Confidence-Aware Learning for Deep Neural Networks (CRL) [paper] [notes]
  57. A How-to-Model Guide for Neuroscience [paper] [notes]
  58. Neural ODEs [paper] [notes]
  59. Model-based Reinforcement Learning [notes]
  60. Learning to Describe Scenes with Programs [paper] [notes]
  61. DeepSynth: Automata Synthesis for Automatic Task Segmentation in RL [paper] [notes]
  62. Model-free Conventions in MARL with Heterogeneous Preferences [paper] [notes]
  63. Accelerating Reinforcement Learning with Learned Skill Priors [paper] [notes]
  64. Progressive Domain Adaptation for Object Detection [paper] [notes]
  65. Convolutional Networks with Adaptive Inference Graphs [paper] [notes]