<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.3.2">Jekyll</generator><link href="https://ai3-genai4sci.github.io/lab-website/feed.xml" rel="self" type="application/atom+xml" /><link href="https://ai3-genai4sci.github.io/lab-website/" rel="alternate" type="text/html" /><updated>2025-04-10T08:56:28+00:00</updated><id>https://ai3-genai4sci.github.io/lab-website/feed.xml</id><title type="html">AI3-GenAI4Sci</title><subtitle>An engaging 1-3 sentence description of your lab.</subtitle><entry><title type="html">GroupPost20250407-WorkAlign</title><link href="https://ai3-genai4sci.github.io/lab-website/2025/04/07/GroupPost-WorkAlign.html" rel="alternate" type="text/html" title="GroupPost20250407-WorkAlign" /><published>2025-04-07T00:00:00+00:00</published><updated>2025-04-10T08:53:51+00:00</updated><id>https://ai3-genai4sci.github.io/lab-website/2025/04/07/GroupPost-WorkAlign</id><content type="html" xml:base="https://ai3-genai4sci.github.io/lab-website/2025/04/07/GroupPost-WorkAlign.html"><![CDATA[<p>Internal Report - Meeting Minutes Link: <a href="https://q47cvaon6v.feishu.cn/wiki/XDaKw3HtdihzLdkf0SRcMmC2nAe?from=from_copylink">feishu docs</a>. To view or edit, please request access.</p>]]></content><author><name>All team members</name></author><category term="biology" /><category term="generative-model" /><category term="LLM" /><summary type="html"><![CDATA[Internal Report - Meeting Minutes Link: feishu docs. 
To view or edit, please request access.]]></summary></entry><entry><title type="html">GroupPost20250325-WorkAlign</title><link href="https://ai3-genai4sci.github.io/lab-website/2025/03/25/GroupPost-WorkAlign.html" rel="alternate" type="text/html" title="GroupPost20250325-WorkAlign" /><published>2025-03-25T00:00:00+00:00</published><updated>2025-04-10T08:53:51+00:00</updated><id>https://ai3-genai4sci.github.io/lab-website/2025/03/25/GroupPost-WorkAlign</id><content type="html" xml:base="https://ai3-genai4sci.github.io/lab-website/2025/03/25/GroupPost-WorkAlign.html"><![CDATA[<p>Internal Report - Meeting Minutes Link: <a href="https://q47cvaon6v.feishu.cn/wiki/KANIwrOd1iAQ8DkBzidcI8Sanqe?from=from_copylink">feishu docs</a>. To view or edit, please request access.</p>]]></content><author><name>All team members</name></author><category term="biology" /><category term="generative-model" /><category term="LLM" /><summary type="html"><![CDATA[Internal Report - Meeting Minutes Link: feishu docs. To view or edit, please request access.]]></summary></entry><entry><title type="html">GroupPost20250304-COT Compress</title><link href="https://ai3-genai4sci.github.io/lab-website/2025/03/04/GroupPost-COT-Compress.html" rel="alternate" type="text/html" title="GroupPost20250304-COT Compress" /><published>2025-03-04T00:00:00+00:00</published><updated>2025-04-10T08:53:51+00:00</updated><id>https://ai3-genai4sci.github.io/lab-website/2025/03/04/GroupPost-COT-Compress</id><content type="html" xml:base="https://ai3-genai4sci.github.io/lab-website/2025/03/04/GroupPost-COT-Compress.html"><![CDATA[<!-- excerpt start -->

<p><strong>Report Overview</strong></p>

<p>This report explores Chain-of-Thought (COT) compression techniques aimed at reducing the computational cost and enhancing the efficiency of large language models (LLMs) in complex reasoning tasks. While COT improves LLM performance by generating intermediate reasoning steps, the length of these reasoning chains leads to increased computational demands. COT compression seeks to shorten these chains without significantly impacting model performance.</p>

<p><!-- excerpt end --></p>

<p><strong>Key Strategies and Methods</strong></p>

<p>The report covers several key strategies for COT compression:</p>

<ol>
  <li><strong>Explicit Compression During Training:</strong>
    <ul>
      <li><strong>Knowledge Distillation:</strong> This involves transferring knowledge from a complex “System 2” model (which generates COT) to a more efficient “System 1” model (which directly outputs results), thereby speeding up the inference process.
        <ul>
          <li>Notable work: Meta’s “Distilling System 2 into System 1.”</li>
        </ul>
      </li>
      <li><strong>Step-by-Step Training:</strong> This method iteratively trains models to identify and skip redundant reasoning steps, effectively shortening the reasoning pathway.
        <ul>
          <li>Notable work: Research by Liu et al., “Can Language Models Learn to Skip Steps?”</li>
        </ul>
      </li>
      <li><strong>Data-Conditioned Training:</strong> Utilizes GPT-4 to create pairs of long and short COT data, training models to generate reasoning chains of varying lengths.
        <ul>
          <li>Notable work: Research by BeiKe, “C3OT: Generating Shorter Chain-of-Thought without Compromising Effectiveness.”</li>
        </ul>
      </li>
    </ul>
  </li>
  <li><strong>Hidden State Compression:</strong>
    <ul>
      <li><strong>Implicit COT:</strong> Gradually internalizes the explicit COT reasoning process into the model’s hidden states, eliminating the need for explicit reasoning chains.
        <ul>
          <li>Notable work: Research by Deng et al., “From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step.”</li>
        </ul>
      </li>
      <li><strong>Compressed COT (CCOT):</strong> Uses semantically dense tokens to represent the reasoning chain, compressing it within the hidden-state space instead of spelling it out as text.
        <ul>
          <li>Notable work: Research by JHU, “Compressed Chain of Thought: Efficient Reasoning Through Dense Representations.”</li>
        </ul>
      </li>
    </ul>
  </li>
  <li><strong>Dynamic Length Control:</strong>
    <ul>
      <li><strong>COT-Valve:</strong> Identifies a direction in the model’s parameter space whose strength can be adjusted to control the length of the generated reasoning chain.
        <ul>
          <li>Notable work: Research by NUS, “COT-Valve: Length-Compressible Chain-of-Thought Tuning.”</li>
        </ul>
      </li>
    </ul>
  </li>
  <li><strong>Reinforcement Learning Compression:</strong>
    <ul>
      <li><strong>O1-Pruner:</strong> Designs a length-harmonizing reward function to guide models in generating shorter yet accurate reasoning sequences through reinforcement learning.
        <ul>
          <li>Notable work: Research by Sun Yat-sen University, “O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning.”</li>
        </ul>
      </li>
      <li><strong>Kimi k1.5:</strong> Explores several long-to-short strategies for shortening reasoning chains, including model merging, shortest rejection sampling, DPO, and Long2short RL.
        <ul>
          <li>Notable work: Kimi k1.5 Technical Report.</li>
        </ul>
      </li>
    </ul>
  </li>
</ol>
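
<p>As an illustration of one of the strategies listed above, shortest rejection sampling can be sketched as follows: sample several responses to the same question and keep the shortest one that is correct, for example as a fine-tuning target. This is a minimal sketch rather than the Kimi k1.5 implementation; the function name, the whitespace-based length measure, and the correctness check are assumptions made for illustration.</p>

```python
from typing import Callable, Optional, Sequence


def shortest_correct_response(
    responses: Sequence[str],
    is_correct: Callable[[str], bool],
    length_of: Callable[[str], int] = lambda text: len(text.split()),
) -> Optional[str]:
    """Return the shortest response that passes the correctness check.

    Returns None when no sampled response is correct.
    """
    correct = [r for r in responses if is_correct(r)]
    if not correct:
        return None
    # min() keeps the first of several equally short candidates.
    return min(correct, key=length_of)


# Hypothetical samples for one question whose reference answer is "42".
samples = [
    "First compute 6 * 7. That gives 42. Double-check: 42. Answer: 42",
    "6 * 7 = 42. Answer: 42",
    "Answer: 41",
]
chosen = shortest_correct_response(samples, lambda r: r.endswith("Answer: 42"))
print(chosen)
```

<p>The length measure is passed in as a parameter so that a tokenizer-based count can replace the whitespace split used here.</p>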

<p><strong>Key Research Trends</strong></p>

<ul>
  <li>A shift from explicit token compression to implicit representation compression, focusing on more efficient reasoning methods.</li>
  <li>The growing prominence of reinforcement learning in reasoning chain compression.</li>
  <li>The exploration of multi-agent frameworks for social reasoning.</li>
</ul>

<p><strong>Future Outlook</strong></p>

<ul>
  <li>As models continue to scale, managing inference costs will become increasingly critical.</li>
  <li>The development of more effective compression techniques is essential for deploying LLMs in resource-constrained environments.</li>
</ul>

<p><strong>Report Significance</strong></p>

<ul>
  <li>Provides a comprehensive overview of COT compression technologies, highlighting the latest advancements in the field.</li>
  <li>Offers practical insights for optimizing LLM performance and efficiency through appropriate compression strategies.</li>
</ul>

<p>The specific files can be found here: <a href="https://github.com/AI3-GenAI4Sci/lab-website/blob/main/docs/TrendReport_ChainCompression-20250304.pdf">Trend Report.pdf</a></p>]]></content><author><name>Rui Xu</name></author><category term="LLM" /><category term="reinforce-learning" /><category term="COT" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">GroupPost20250225-WorkAlign</title><link href="https://ai3-genai4sci.github.io/lab-website/2025/02/25/GroupPost-WorkAlign.html" rel="alternate" type="text/html" title="GroupPost20250225-WorkAlign" /><published>2025-02-25T00:00:00+00:00</published><updated>2025-04-10T08:53:51+00:00</updated><id>https://ai3-genai4sci.github.io/lab-website/2025/02/25/GroupPost-WorkAlign</id><content type="html" xml:base="https://ai3-genai4sci.github.io/lab-website/2025/02/25/GroupPost-WorkAlign.html"><![CDATA[<p>Internal Report - Meeting Minutes Link: <a href="https://q47cvaon6v.feishu.cn/wiki/PJBjwZsiIieor0k2Vkocbj6Wngh?from=from_copylink">feishu docs</a>. To view or edit, please request access.</p>]]></content><author><name>All team members</name></author><category term="biology" /><category term="medicine" /><category term="reinforce learning" /><summary type="html"><![CDATA[Internal Report - Meeting Minutes Link: feishu docs. To view or edit, please request access.]]></summary></entry><entry><title type="html">GroupPost20241119-RFDiffusion</title><link href="https://ai3-genai4sci.github.io/lab-website/2024/11/19/GroupPost-RFDiffusion.html" rel="alternate" type="text/html" title="GroupPost20241119-RFDiffusion" /><published>2024-11-19T00:00:00+00:00</published><updated>2025-04-10T08:53:51+00:00</updated><id>https://ai3-genai4sci.github.io/lab-website/2024/11/19/GroupPost-RFDiffusion</id><content type="html" xml:base="https://ai3-genai4sci.github.io/lab-website/2024/11/19/GroupPost-RFDiffusion.html"><![CDATA[<!-- excerpt start -->
<p>RFdiffusion is a general protein design framework based on diffusion models, capable of de novo design of binders and higher-order symmetric cyclic polymers, among other types of proteins.</p>

<p>Mr. Shi presented the classic RFdiffusion algorithm and gave a brief introduction to:</p>

<ul>
  <li>the basics of proteins</li>
  <li>RFdiffusion training methods</li>
  <li>applications
 <!-- excerpt end --></li>
</ul>

<p>The specific files can be found here: <a href="https://github.com/AI3-GenAI4Sci/lab-website/blob/main/docs/RFDiffusion-20241119.pdf">RFDiffusion.pdf</a></p>]]></content><author><name>Shaojie Shi</name></author><category term="biology" /><category term="medicine" /><category term="Generative Model" /><summary type="html"><![CDATA[RFdiffusion is a general protein design framework based on diffusion models, capable of de novo design of binders and higher-order symmetric cyclic polymers, among other types of proteins.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://ai3-genai4sci.github.io/lab-website/images/RFDiffusion.jpg" /><media:content medium="image" url="https://ai3-genai4sci.github.io/lab-website/images/RFDiffusion.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">GroupPost20241105-AlphaFold2 &amp;amp; PeptideGPT</title><link href="https://ai3-genai4sci.github.io/lab-website/2024/11/05/GroupPost-AlphaFold2&amp;PeptideGPT.html" rel="alternate" type="text/html" title="GroupPost20241105-AlphaFold2 &amp;amp; PeptideGPT" /><published>2024-11-05T00:00:00+00:00</published><updated>2025-04-10T08:53:51+00:00</updated><id>https://ai3-genai4sci.github.io/lab-website/2024/11/05/GroupPost-AlphaFold2&amp;PeptideGPT</id><content type="html" xml:base="https://ai3-genai4sci.github.io/lab-website/2024/11/05/GroupPost-AlphaFold2&amp;PeptideGPT.html"><![CDATA[<!-- excerpt start -->
<p>AF2 (AlphaFold2), the top-performing method at CASP14 in 2020, is also the work recognized by the 2024 Nobel Prize in Chemistry and holds significant importance for protein engineering.</p>

<p>PeptideGPT is a pipeline to generate protein sequences with specific functions.</p>

<p><!-- excerpt end --></p>

<p>The following is an overview of the presentations by Bohao Lv:</p>

<p>PeptideGPT:</p>
<ul>
  <li>Fine-tuned ProtGPT2, an existing protein sequence generation model, on proteins with a specific function so that it generates sequences with that function.</li>
  <li>Applied bioinformatics knowledge as a first-round screen for sequence plausibility.</li>
  <li>Used the structure prediction model ESMFold as a second-round screen for structural plausibility.</li>
  <li>Employed a classifier to validate the functions of the generated protein sequences.</li>
</ul>
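
<p>The staged screening described above can be sketched as a chain of filters applied to generated sequences. This is only an illustrative skeleton, not the PeptideGPT code: the function names and all three placeholder checks are assumptions, standing in for the bioinformatics rules, an ESMFold-based structure check, and the function classifier.</p>

```python
from typing import Callable, Iterable, List


def screen_sequences(
    candidates: Iterable[str],
    sequence_ok: Callable[[str], bool],
    structure_ok: Callable[[str], bool],
    has_function: Callable[[str], bool],
) -> List[str]:
    """Apply the three screening stages in order and keep the survivors."""
    kept = []
    for seq in candidates:
        # all() stops at the first failing stage, so the costlier
        # structure and function checks only run on earlier survivors.
        if all(check(seq) for check in (sequence_ok, structure_ok, has_function)):
            kept.append(seq)
    return kept


# Placeholder checks (assumptions): a real pipeline would call a structure
# predictor and a trained classifier instead of these toy rules.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def only_standard_residues(seq: str) -> bool:
    return len(seq) > 0 and set(seq).issubset(STANDARD_AMINO_ACIDS)


def long_enough(seq: str) -> bool:
    # Stand-in for a structure-confidence cutoff.
    return len(seq) >= 8


def contains_lysine(seq: str) -> bool:
    # Stand-in for a function classifier.
    return "K" in seq


generated = ["MKTAYIAKQR", "MKXB", "ACDEFGHI", "MKT"]
survivors = screen_sequences(generated, only_standard_residues, long_enough, contains_lysine)
print(survivors)
```

<p>Ordering the stages from cheapest to most expensive means structure prediction is only run on sequences that already passed the sequence-level screen.</p>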

<p>AF2:</p>
<ul>
  <li>Decoder part (structure module)
    <ul>
      <li>Invariant Point Attention (IPA)</li>
      <li>Backbone prediction</li>
      <li>Atom prediction</li>
    </ul>
  </li>
  <li>Loss function</li>
</ul>

<p>The specific files can be found here: <a href="https://github.com/AI3-GenAI4Sci/lab-website/blob/main/docs/AF2-20241119.pdf">AF2.pdf</a> &amp; <a href="https://zhuanlan.zhihu.com/p/4412467760">PeptideGPT Blog</a></p>]]></content><author><name>Bohao Lv</name></author><category term="biology" /><category term="medicine" /><category term="generative-model" /><summary type="html"><![CDATA[AF2 (AlphaFold2), the top-performing method at CASP14 in 2020, is also the work recognized by the 2024 Nobel Prize in Chemistry and holds significant importance for protein engineering.]]></summary></entry><entry><title type="html">GroupPost20241029-Maple &amp;amp; AlphaFold2</title><link href="https://ai3-genai4sci.github.io/lab-website/2024/10/29/GroupPost-MAPLE&amp;AlphaFold2.html" rel="alternate" type="text/html" title="GroupPost20241029-Maple &amp;amp; AlphaFold2" /><published>2024-10-29T00:00:00+00:00</published><updated>2025-04-10T08:53:51+00:00</updated><id>https://ai3-genai4sci.github.io/lab-website/2024/10/29/GroupPost-MAPLE&amp;AlphaFold2</id><content type="html" xml:base="https://ai3-genai4sci.github.io/lab-website/2024/10/29/GroupPost-MAPLE&amp;AlphaFold2.html"><![CDATA[<!-- excerpt start -->
<p>MAPLE can be used to predict methylation age and disease risk. It achieves stable and precise results by eliminating batch effects through contrastive learning methods.</p>

<p>AF2 (AlphaFold2), the top-performing method at CASP14 in 2020, is also the work recognized by the 2024 Nobel Prize in Chemistry and holds significant importance for protein engineering.
 <!-- excerpt end --></p>

<p>The following is an overview of the presentations by Yu Zhang &amp; Bohao Lv:</p>

<p>MAPLE:</p>
<ul>
  <li>Predicting an individual’s age and disease probability through methylation data</li>
  <li>Using contrastive learning methods to eliminate batch effects between methylation data from different sources</li>
  <li>Capturing biological factors related to disease risk using MAPLE</li>
  <li>Analyzing MAPLE results under the framework of aging biology</li>
</ul>
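
<p>To illustrate how a contrastive objective can counteract batch effects, the sketch below shows a generic margin-based pairwise loss: samples that share a biological label are pulled together in embedding space even when they come from different data sources, while samples with different labels are pushed apart. This is not MAPLE's actual objective, which this summary does not specify; the function name, the notion of a shared label, and the margin value are assumptions made for illustration.</p>

```python
import math
from typing import Sequence


def contrastive_pair_loss(
    z_a: Sequence[float],
    z_b: Sequence[float],
    same_label: bool,
    margin: float = 1.0,
) -> float:
    """Margin-based contrastive loss for one pair of embeddings."""
    distance = math.dist(z_a, z_b)
    if same_label:
        # Pull pairs with the same label together, whatever their source.
        return distance ** 2
    # Push pairs with different labels at least `margin` apart.
    return max(0.0, margin - distance) ** 2


# Hypothetical 2-D embeddings of methylation samples.
print(contrastive_pair_loss([0.0, 0.0], [3.0, 4.0], same_label=True))
print(contrastive_pair_loss([0.0, 0.0], [0.0, 0.5], same_label=False))
print(contrastive_pair_loss([0.0, 0.0], [0.0, 2.0], same_label=False))
```

<p>Because the loss depends only on labels and distances, an encoder trained with it has no incentive to keep source-specific variation that is unrelated to the label.</p>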

<p>AF2:</p>
<ul>
  <li>Input feature construction: Multiple Sequence Alignment (MSA) + Pair representation</li>
  <li>Encoder part (Evoformer):
    <ul>
      <li>MSA representation update: row-wise (per-sequence) self-attention with pair bias + column-wise (per-residue) self-attention</li>
      <li>Pair representation update: triangular multiplicative update + triangular self-attention</li>
    </ul>
  </li>
</ul>
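
<p>The core of the triangular multiplicative update on the pair representation can be sketched in a few lines. The version below covers only the "outgoing edges" sum over a third residue k for a single scalar channel; the gating, layer normalization, and learned projections of the real Evoformer block are deliberately left out, and the function name is an assumption.</p>

```python
from typing import List

Matrix = List[List[float]]


def triangle_multiply_outgoing(a: Matrix, b: Matrix) -> Matrix:
    """Combine edges i->k and j->k over every third residue k.

    out[i][j] = sum over k of a[i][k] * b[j][k]
    """
    n = len(a)
    return [
        [sum(a[i][k] * b[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Toy single-channel pair representation for three residues.
pair = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 3.0],
    [2.0, 3.0, 0.0],
]
update = triangle_multiply_outgoing(pair, pair)
print(update[0][1])
```

<p>In the full model the two inputs are separate gated projections of the pair representation; passing the same matrix twice here is a simplification to keep the example short.</p>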

<p>The specific files can be found here: <a href="https://github.com/AI3-GenAI4Sci/lab-website/blob/main/docs/MAPLE-20241119.pdf">MAPLE.pdf</a> &amp; <a href="https://github.com/AI3-GenAI4Sci/lab-website/blob/main/docs/AF2-20241119.pdf">AF2.pdf</a></p>]]></content><author><name>Yu Zhang &amp; Bohao Lv</name></author><category term="biology" /><category term="medicine" /><category term="contrastive learning" /><summary type="html"><![CDATA[MAPLE can be used to predict methylation age and disease risk. It achieves stable and precise results by eliminating batch effects through contrastive learning methods.]]></summary></entry></feed>