<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>My Digital Library on Everyday Is A School Day</title>
    <link>https://www.kenkoonwong.com/blog/</link>
    <description>Recent content in My Digital Library on Everyday Is A School Day</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en</language>
    <lastBuildDate>Mon, 02 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://www.kenkoonwong.com/blog/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Learning Carbapenemase Producing Genes</title>
      <link>https://www.kenkoonwong.com/blog/cre/</link>
      <pubDate>Mon, 02 Mar 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.kenkoonwong.com/blog/cre/</guid>
      <description>&lt;script src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/kePrint/kePrint.js&#34;&gt;&lt;/script&gt;
&lt;link href=&#34;https://www.kenkoonwong.com/blog/cre/index_files/lightable/lightable.css&#34; rel=&#34;stylesheet&#34; /&gt;
&lt;blockquote&gt;
&lt;p&gt;Learnt major carbapenemase genes (KPC, NDM, OXA) using NCBI isolate data and molecular dynamics. Includes gene frequency trends, co-resistance patterns, and MM/PBSA binding comparisons of avibactam with KPC vs NDM to illustrate mechanistic differences 🧬&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2 id=&#34;motivations&#34;&gt;Motivations:
  &lt;a href=&#34;#motivations&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;We&amp;rsquo;ve previously learnt about 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/amr/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ESBL genes&lt;/a&gt;, took a peek under the hood at their nucleotide sequences, and explored the NCBI library to assess their frequency. Why not pick another AMR gene family and learn some more? Let&amp;rsquo;s explore carbapenemase-producing organisms! In this blog, we&amp;rsquo;ll spare you the code, as we basically use the same workflow as before: change a few search keys and variables, and out comes the result!&lt;/p&gt;




&lt;h2 id=&#34;objectives&#34;&gt;Objectives:
  &lt;a href=&#34;#objectives&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#genes&#34;&gt;Which are carbapenemase producing genes?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#how&#34;&gt;How are we going to do this?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#carba&#34;&gt;Results&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#noncre&#34;&gt;The Proportion of Carbapenemase-Producing Genes in Meropenem-Resistant Organisms&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#freq&#34;&gt;The Frequency of Carbapenemase Genes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#anim&#34;&gt;Visualize Carbapenemase Gene Frequency By Year&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#coresistant&#34;&gt;Do MBLs Frequently Have Co-resistance With Other Carbapenemases?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#mdsim&#34;&gt;Let&amp;rsquo;s Take a Look At Avibactam-KPC and Avibactam-NDM Molecular Dynamic Simulation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#opportunities&#34;&gt;Opportunities for improvement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#lesson&#34;&gt;Lessons learnt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2 id=&#34;genes&#34;&gt;Which Are Carbapenemase Producing Genes?
  &lt;a href=&#34;#genes&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;These are the major carbapenemases: &lt;br&gt;
class A - KPC. &lt;br&gt;
class B metallo-β-lactamases (MBLs) - NDM, VIM, IMP. &lt;br&gt;
class D - OXA-48, OXA-181, OXA-232, OXA-244.&lt;/p&gt;
&lt;p&gt;Carbapenemases fall into three molecular classes that differ in their hydrolytic mechanism. Class A and D carbapenemases use a serine-based active site, whereas class B MBLs rely on zinc in their active site. This is interesting because avibactam was developed to have activity against class A, C, and D beta-lactamases, but it does not work on class B! 😵‍💫 Hence, before the susceptibility result is back, knowing which carbapenemase is present would be ideal. In fact, the NG-Test CARBA 5 (also called &amp;ldquo;Carba 5&amp;rdquo;) is a rapid immunochromatographic lateral flow assay that detects and differentiates the five most common carbapenemase families: KPC, NDM, VIM, IMP, and OXA-48-like. I heard it takes about 15 minutes to run one colony.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2Fwww.ngbiotech.com%2Fwp-content%2Fuploads%2F2021%2F08%2FVisuel-cassette-Carba.jpg&amp;f=1&amp;nofb=1&amp;ipt=200922139ac4aa79fdcf315ea8633e5888482cfc6c4b1fcaf9259fc69b75d946&#34; alt=&#34;image&#34; width=&#34;40%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;An interesting point about MBL-producing Enterobacterales: the IDSA recommends either &lt;code&gt;ceftazidime-avibactam + aztreonam&lt;/code&gt; combination therapy or &lt;code&gt;cefiderocol monotherapy&lt;/code&gt;. The rationale for the combination is that aztreonam (a monobactam) is stable against metallo-β-lactamases, while avibactam inhibits the serine β-lactamases (ESBLs, AmpC, KPC, OXA-48-like) that frequently co-exist in MBL-producing organisms and would otherwise hydrolyze aztreonam.&lt;/p&gt;




&lt;h2 id=&#34;how&#34;&gt;How Are We Going To Do This?
  &lt;a href=&#34;#how&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Well, first of all, let&amp;rsquo;s get the FASTA files of all NCBI bacterial isolates with meropenem resistance 
&lt;a href=&#34;https://www.ncbi.nlm.nih.gov/pathogens/isolates/#&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;here&lt;/a&gt;. Then download all the carbapenemase-producing genes 
&lt;a href=&#34;https://www.ncbi.nlm.nih.gov/pathogens/refgene/#&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;here&lt;/a&gt; by inserting &lt;code&gt;gene_family:(blaKPC blaNDM blaVIM blaIMP blaOXA-48 blaOXA-181 blaOXA-232 blaOXA-244)&lt;/code&gt; into the filter.&lt;/p&gt;
&lt;p&gt;Then, run through the code we had previously 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/amr/#allin&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;here&lt;/a&gt;, assess exact matches, and visualize the frequency just like before! We&amp;rsquo;ll also assess what proportion of these isolates do not match any of our carbapenemase-producing genes, since not all meropenem resistance is due to carbapenemases; some could be due to porin loss, overexpression of efflux pumps, ESBL + porin loss, or AmpC + porin loss!&lt;/p&gt;
&lt;p&gt;After that, let&amp;rsquo;s visualize the distribution based on the submission dates.&lt;/p&gt;
&lt;p&gt;Remember we mentioned that with MBLs we need either combination therapy of aztreonam + ceftaz/avibactam or cefiderocol monotherapy because other resistance mechanisms co-exist? Let&amp;rsquo;s take a look at all MBLs and see which other carbapenemase genes co-exist!&lt;/p&gt;
&lt;p&gt;Lastly, let&amp;rsquo;s test out our molecular dynamics experiment on KPC and NDM with avibactam! We should see a strong binding affinity for KPC and a very weak binding affinity for NDM! We&amp;rsquo;re also going to include post-simulation analyses called MM/PBSA and MM/GBSA. What they do is estimate the binding free energy of the ligand to the protein. The more negative the value, the stronger the binding affinity. This is a great way to quantify the results of a simulation and compare between different simulations!&lt;/p&gt;




&lt;h2 id=&#34;carba&#34;&gt;Results
  &lt;a href=&#34;#carba&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;




&lt;h3 id=&#34;noncre&#34;&gt;The Proportion of Carbapenemase-Producing Genes in Meropenem-Resistant Organisms
  &lt;a href=&#34;#noncre&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;We have a total of 1027 isolates, and 45.86% have detected carbapenemase genes! That&amp;rsquo;s with exact matches only; I did not check for low-level mismatches.&lt;/p&gt;




&lt;h3 id=&#34;freq&#34;&gt;The Frequency of Carbapenemase Genes
  &lt;a href=&#34;#freq&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Wow, NDM-1 is at the top? I&amp;rsquo;ve always thought KPCs were more frequent. Note that these are the sequences that were submitted to NCBI, which do not necessarily reflect the actual distribution in the real world. But still, it&amp;rsquo;s interesting to see that NDM-1 is more frequently submitted than KPCs.&lt;/p&gt;




&lt;h3 id=&#34;anim&#34;&gt;Visualize Carbapenemase Gene Frequency By Year
  &lt;a href=&#34;#anim&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;cre_anim.gif&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;Looking at the above, KPC submissions dominated up until 2018. In 2019, NDM-1 crept up to the top spot, and it has remained number one through 2025.&lt;/p&gt;




&lt;h3 id=&#34;coresistant&#34;&gt;Do MBLs Frequently Have Co-resistance With Other Carbapenemases?
  &lt;a href=&#34;#coresistant&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; primary_gene &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; gene &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; n &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; NDM-1 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; OXA-48 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 13 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; NDM-5 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; OXA-48 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 12 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; NDM-1 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; OXA-232 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 5 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; NDM-5 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; OXA-181 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 5 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; NDM-5 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; OXA-232 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; NDM-7 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; OXA-232 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; NDM-1 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; NDM-4 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; NDM-1 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; VIM-1 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; NDM-1 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; VIM-2 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Interestingly, when we look at MBLs with co-resistance, the partner is usually OXA-48 or OXA-48-like! But of all the MBLs (n=257) that were submitted, only 15.18% (n=39) carry an OXA as well. There were 3 NDMs with another MBL. So it makes sense to combine aztreonam with ceftaz/avibactam to counter the OXA-48 beta-lactamase. Note that we did not include ESBLs or AmpC in our search.&lt;/p&gt;
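&lt;p&gt;As a quick sanity check on those numbers, here is a minimal shell sketch that re-tallies the table. The rows are typed in by hand from the table above, the total of 257 MBLs comes from the text, and the function name &lt;code&gt;tally_coresistance&lt;/code&gt; is made up for illustration; it is not part of the original workflow.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;# Re-tally the co-resistance table: OXA partners vs. a second MBL partner.
# Rows are copied by hand from the table; 257 is the MBL total quoted in the text.
tally_coresistance() {
  printf '%s\n' \
    'NDM-1 OXA-48 13' 'NDM-5 OXA-48 12' 'NDM-1 OXA-232 5' \
    'NDM-5 OXA-181 5' 'NDM-5 OXA-232 2' 'NDM-7 OXA-232 2' \
    'NDM-1 NDM-4 1' 'NDM-1 VIM-1 1' 'NDM-1 VIM-2 1' |
  awk -v total=257 '
    $2 ~ /^OXA/ { oxa += $3; next }   # partner is a class D (OXA) carbapenemase
                { mbl += $3 }         # partner is another metallo-beta-lactamase
    END { printf "oxa=%d mbl=%d oxa_pct=%.2f\n", oxa, mbl, 100 * oxa / total }'
}
tally_coresistance
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This prints &lt;code&gt;oxa=39 mbl=3 oxa_pct=15.18&lt;/code&gt;, which agrees with the counts quoted in the paragraph.&lt;/p&gt;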




&lt;h3 id=&#34;mdsim&#34;&gt;Let&amp;rsquo;s Take a Look At Avibactam-KPC and Avibactam-NDM Molecular Dynamic Simulation
  &lt;a href=&#34;#mdsim&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;We&amp;rsquo;ll use the same pipeline as before, but this time we&amp;rsquo;ll add Molecular Mechanics / Poisson-Boltzmann Surface Area (MM/PBSA).&lt;/p&gt;




&lt;h4 id=&#34;installation&#34;&gt;Installation
  &lt;a href=&#34;#installation&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;conda create -n gmxpbsa -c conda-forge gmx_mmpbsa ambertools
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;conda activate gmxpbsa
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4 id=&#34;write-mmpbsain&#34;&gt;Write &lt;code&gt;mmpbsa.in&lt;/code&gt;
  &lt;a href=&#34;#write-mmpbsain&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;p&gt;You can use &lt;code&gt;nano&lt;/code&gt; or &lt;code&gt;nvim&lt;/code&gt;, then paste the following:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&amp;amp;general
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   &lt;span style=&#34;color:#008080&#34;&gt;startframe&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;1,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   &lt;span style=&#34;color:#008080&#34;&gt;endframe&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;500,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   &lt;span style=&#34;color:#008080&#34;&gt;interval&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;5,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   &lt;span style=&#34;color:#008080&#34;&gt;verbose&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;2,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;/
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&amp;amp;pb
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   &lt;span style=&#34;color:#008080&#34;&gt;istrng&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;0.150,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   &lt;span style=&#34;color:#008080&#34;&gt;fillratio&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;4.0,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   &lt;span style=&#34;color:#008080&#34;&gt;inp&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;1,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   &lt;span style=&#34;color:#008080&#34;&gt;radiopt&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;0,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;/
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;blockquote&gt;
&lt;p&gt;Note: Make sure to change the startframe and endframe to the range where the protein RMSD and ligand RMSD are stable. Essentially, we are sampling from the stable portion of the simulation.&lt;/p&gt;
&lt;/blockquote&gt;
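&lt;p&gt;To get a feel for how many frames a given window actually feeds into the calculation, here is a small shell sketch. It assumes frames are taken from startframe to endframe in steps of interval and that the trajectory has at least endframe frames; the helper name &lt;code&gt;frames_used&lt;/code&gt; is hypothetical.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;# Count the frames sampled for a given startframe, endframe and interval.
# Assumption: frames start, start+interval, ... up to endframe are read.
frames_used() {
  # usage: frames_used STARTFRAME ENDFRAME INTERVAL
  echo $(( ($2 - $1) / $3 + 1 ))
}
frames_used 1 500 5   # settings from the MM/PBSA input in this post
frames_used 1 500 1   # same window with interval=1, as in the MM/GBSA input
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Under that assumption, the two calls print 100 and 500, so the MM/PBSA settings in this post average over roughly a fifth as many frames as a run with interval=1.&lt;/p&gt;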
&lt;p&gt;When running MM/GBSA, use the parameters below, perhaps saved as &lt;code&gt;mmgbsa.in&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&amp;amp;general
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#008080&#34;&gt;startframe&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; 1,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#008080&#34;&gt;endframe&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; 500,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#008080&#34;&gt;interval&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; 1,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#008080&#34;&gt;verbose&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; 1,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#008080&#34;&gt;keep_files&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; 0,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;/
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&amp;amp;gb
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#008080&#34;&gt;igb&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; 5,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#008080&#34;&gt;saltcon&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; 0.15,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;/
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4 id=&#34;to-run-mmpbsa-or-mmgbsa&#34;&gt;To Run MM/PBSA or MM/GBSA
  &lt;a href=&#34;#to-run-mmpbsa-or-mmgbsa&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx_MMPBSA &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -O &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -i mmpbsa.in &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -cs md.tpr &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -ct md_noPBC.xtc &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -ci index.ndx &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -cg &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;13&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -cp topol.top &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -o FINAL_RESULTS.dat &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -eo FINAL_RESULTS.csv
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;blockquote&gt;
&lt;p&gt;Note: Make sure to run our regular pipeline first (create the index, etc.), and leave MM/PBSA or MM/GBSA until last.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The binding free energy equation: &lt;br&gt;
ΔG&lt;sub&gt;bind&lt;/sub&gt; = ΔG&lt;sub&gt;gas&lt;/sub&gt; + ΔG&lt;sub&gt;solv&lt;/sub&gt; &lt;br&gt;
Where: &lt;br&gt;
ΔG&lt;sub&gt;gas&lt;/sub&gt; = gas-phase molecular mechanics energy (bonds, angles, dihedrals, van der Waals, electrostatics) &lt;code&gt;BOND+ANGLE+DIHED+VDWAALS+EEL+1-4 VDW+1-4 EEL&lt;/code&gt; &lt;br&gt;
ΔG&lt;sub&gt;solv&lt;/sub&gt; = solvation free energy (how the molecule interacts with the surrounding water) &lt;code&gt;EPB+ENPOLAR&lt;/code&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: when you read the FINAL_RESULTS.csv, go to the last section, that&amp;rsquo;s the delta. Total = sum of all the columns except Total. Negative == 👍&lt;/p&gt;
&lt;/blockquote&gt;
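&lt;p&gt;To make the bookkeeping concrete, here is a tiny shell sketch that adds up one frame&amp;rsquo;s worth of delta terms. The numbers are made up for illustration and are not taken from these simulations, the helper name &lt;code&gt;delta_total&lt;/code&gt; is hypothetical, and the bonded and 1-4 terms are left out on the assumption that they are zero in the delta.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;# Sum hypothetical delta terms into gas-phase, solvation and total energies.
# Arguments (made-up values): VDWAALS EEL EPB ENPOLAR
delta_total() {
  echo "$1 $2 $3 $4" |
  awk '{ gas  = $1 + $2          # gas-phase part: VDWAALS + EEL
         solv = $3 + $4          # solvation part: EPB + ENPOLAR
         printf "GGAS=%.2f GSOLV=%.2f TOTAL=%.2f\n", gas, solv, gas + solv }'
}
delta_total -30.00 -20.00 35.00 -3.00
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For these made-up inputs it prints &lt;code&gt;GGAS=-50.00 GSOLV=32.00 TOTAL=-18.00&lt;/code&gt;: a favorable gas-phase term partly offset by the solvation penalty, leaving a negative total.&lt;/p&gt;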




&lt;h4 id=&#34;kpc181-avibactam&#34;&gt;KPC181-Avibactam
  &lt;a href=&#34;#kpc181-avibactam&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-13-1.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-13-2.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-13-3.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-13-4.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-13-5.png&#34; width=&#34;1152&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Alright, with the above, the protein RMSD plateaued at around 20 ns, and the ligand RMSD is pretty good and stable as well. We also see good H-bonds and interaction energy, the minimum distance between protein and ligand decreased and converged, and the distance between the centers of the protein and ligand decreased and stabilized too. Let&amp;rsquo;s visualize the first and last frames to make sure before we run MM/PBSA and MM/GBSA.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;kpc181.gif&#34; alt=&#34;image&#34; width=&#34;100%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;Looks convincing! Let&amp;rsquo;s take a look at MM/PBSA and MM/GBSA.&lt;/p&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-14-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Alright, we have quite a few columns in the facets here, but most are not helpful since they are all zeros, mainly because we used &lt;code&gt;inp=1&lt;/code&gt;. But the total binding energy is negative, which is good! You can see that in the TOTAL facet. The median (IQR) is -8.07 (-16.79 to -1.29). Now what about MM/GBSA?&lt;/p&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-15-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;And the median TOTAL (IQR) is &lt;code&gt;-10.615 (-19.84 to -3.38)&lt;/code&gt;. Not too shabby!&lt;/p&gt;




&lt;h4 id=&#34;ndm1-avibactam&#34;&gt;NDM1-Avibactam
  &lt;a href=&#34;#ndm1-avibactam&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;p&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-16-1.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-16-2.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-16-3.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-16-4.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-16-5.png&#34; width=&#34;1152&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The ligand RMSD seems quite large, the H-bonds are intermittent, one of the interaction energies crosses zero, and the minimum distance varies widely towards the end of the simulation. From these plots, it looks like it did not bind well, which is expected since avibactam does not work on MBLs. Let&amp;rsquo;s visualize.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;ndm1.gif&#34; alt=&#34;image&#34; width=&#34;100%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;Wow, they&amp;rsquo;re not in the same position! It completely flipped! Usually in this setting, we shouldn&amp;rsquo;t need to perform MM/PBSA. But what if we did?&lt;/p&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/cre/index_files/figure-html/unnamed-chunk-17-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Wow, even though the median TOTAL is negative, you can see significant fluctuations (large variance) with lots of zeros! Compare this to our previous MM/PBSA and MM/GBSA plots for KPC and you can see the difference!&lt;/p&gt;
&lt;p&gt;What is interesting is that I expected the ligand to drift away, but it didn&amp;rsquo;t. Let&amp;rsquo;s investigate.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;ndm1_pose.gif&#34; alt=&#34;image&#34; width=&#34;100%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;I think the ligand is stuck in the pocket: the protein was undergoing a conformational change that stabilized after ~30 ns, which may have trapped the ligand so that it didn&amp;rsquo;t drift out. I think 🤔. Also, take note that the earlier simulation animation was rendered without the surface (atoms only), whereas this one includes the surface. Overlaying the protein and ligand from frame 0 and the last frame suggests that the initial pose was not optimal.&lt;/p&gt;
&lt;p&gt;There you have it!&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: While MM/PBSA provides relative binding estimates, these simulations do not capture full enzymatic hydrolysis dynamics and should be interpreted as comparative rather than absolute.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2 id=&#34;opportunities&#34;&gt;Opportunities for improvement
  &lt;a href=&#34;#opportunities&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;need to learn ampC, porin loss, and other MDR genes&lt;/li&gt;
&lt;li&gt;need to test our MM/PBSA and MM/GBSA more, adjust isp=2 and see what they look like&lt;/li&gt;
&lt;li&gt;need to find a proper way to interpret/assess AlphaFold protein predictions: which are considered good, which aren&amp;rsquo;t, and how to deal with them&lt;/li&gt;
&lt;li&gt;need to dive into the math, physics, and organic chemistry of these simulations&lt;/li&gt;
&lt;li&gt;need to run 3 replicates, report seeds, and also include 2 other high-scoring poses with similar coordinates&lt;/li&gt;
&lt;li&gt;need to learn covalent docking&lt;/li&gt;
&lt;/ul&gt;




&lt;h2 id=&#34;lesson&#34;&gt;Lessons Learnt
  &lt;a href=&#34;#lesson&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;learnt carbapenemase genes&lt;/li&gt;
&lt;li&gt;learnt that MBL producers may also carry OXA (co-resistance), hence the combination with aztreonam&lt;/li&gt;
&lt;li&gt;learnt MM/PBSA and MM/GBSA&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you like this article:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;please feel free to send me a 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;comment or visit my other blogs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;please feel free to follow me on 
&lt;a href=&#34;https://bsky.app/profile/kenkoonwong.bsky.social&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;BlueSky&lt;/a&gt;, 
&lt;a href=&#34;https://twitter.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;twitter&lt;/a&gt;, 
&lt;a href=&#34;https://github.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub&lt;/a&gt; or 
&lt;a href=&#34;https://rstats.me/@kenkoonwong&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Mastodon&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;if you would like to collaborate, please feel free to 
&lt;a href=&#34;https://www.kenkoonwong.com/contact/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;contact me&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>Assessing TEM, CTX-M, and KPC-2 With Molecular Docking and Molecular Dynamic Simulation</title>
      <link>https://www.kenkoonwong.com/blog/mdsim2/</link>
      <pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.kenkoonwong.com/blog/mdsim2/</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;🧬 Testing beta-lactamase resistance with AlphaFold + DiffDock + GROMACS! Watch clavulanic acid bind TEM-5,  CTX-M-15, KPC-2, and get rejected by TEM-30. Simulation confirms biology! 🔬💊&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2 id=&#34;motivations&#34;&gt;Motivations
  &lt;a href=&#34;#motivations&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Now that we&amp;rsquo;ve learnt the pipeline of molecular docking and molecular dynamics simulation, let&amp;rsquo;s check some other proteins and ligands and see if the results make sense. Let&amp;rsquo;s assess ESBL beta-lactamases, since we looked at them before (
&lt;a href=&#34;https://www.kenkoonwong.com/blog/amr/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;here&lt;/a&gt;). Let&amp;rsquo;s check a few hypotheses. Some TEMs, such as TEM-30, are inhibitor resistant, whereas ESBL beta-lactamases like TEM-5 and CTX-M-15 can be susceptible to beta-lactamase inhibitors such as clavulanic acid, so we should see clavulanic acid bind TEM-5 and CTX-M-15 but not TEM-30. We&amp;rsquo;ll also assess KPC-2, a carbapenemase: it does bind clavulanic acid; however, instead of being inhibited by it, KPC-2 actually hydrolyzes it. But let&amp;rsquo;s see what MD results we get!&lt;/p&gt;
&lt;p&gt;This time, we will get the protein sequences of interest and use &lt;code&gt;AlphaFold Server&lt;/code&gt; to predict each structure. Instead of relying on known coordinates of the protein, let&amp;rsquo;s use &lt;code&gt;DiffDock&lt;/code&gt; to screen for the best docking site, then use that for the MD simulation. We will write some instructions on how to install and run DiffDock, but we will leave out the MD simulation part, which is covered 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/mdsim/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;here&lt;/a&gt;. I will, however, share a good chunk of code for assessing the results, so that I don&amp;rsquo;t have to copy and paste multiple times to assess RMSD, RMSF, hydrogen bonds, gyration, and distance. We will also have some cool visuals! Let&amp;rsquo;s test this out!&lt;/p&gt;




&lt;h2 id=&#34;objectives&#34;&gt;Objectives:
  &lt;a href=&#34;#objectives&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#proteins&#34;&gt;Get Our Proteins&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#alphafold&#34;&gt;Protein Prediction With AlphaFold&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#diffdock&#34;&gt;Using DiffDock to predict binding sites&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#results&#34;&gt;Assess Results&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#tem5&#34;&gt;TEM-5&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#ctxm15&#34;&gt;CTX-M-15&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#tem30&#34;&gt;TEM-30&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#opportunities&#34;&gt;Opportunities for improvement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#lessons&#34;&gt;Lessons Learnt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2 id=&#34;proteins&#34;&gt;Get Our Proteins
  &lt;a href=&#34;#proteins&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;We go 
&lt;a href=&#34;https://www.ncbi.nlm.nih.gov/pathogens/refgene/#gene_family:%28blaTEM%29&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;here&lt;/a&gt; and select the gene of interest and go to &lt;code&gt;Refseq&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;tem_db.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Then click on &lt;code&gt;Fasta&lt;/code&gt; which will bring us to this.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;refseq.png&#34; alt=&#34;&#34;&gt;
Copy the protein sequence and we&amp;rsquo;ll move on to our next step. Also don&amp;rsquo;t forget to copy the other proteins of interest, in our case TEM-30, CTX-M-15, and KPC2.&lt;/p&gt;




&lt;h2 id=&#34;alphafold&#34;&gt;Protein Prediction With AlphaFold
  &lt;a href=&#34;#alphafold&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;We will then visit 
&lt;a href=&#34;https://alphafoldserver.com/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;AlphaFold Server&lt;/a&gt;. You will have to log in, enter the sequence, and continue. Make sure to rename the job to something you can recognize. Let it run; it will likely take a minute. Then download the result and unzip it, and you will see 5 &lt;code&gt;cif&lt;/code&gt; files. I usually use the first one, which ends with &lt;code&gt;0&lt;/code&gt;.&lt;/p&gt;




&lt;h2 id=&#34;diffdock&#34;&gt;Using DiffDock to predict binding sites
  &lt;a href=&#34;#diffdock&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;




&lt;h3 id=&#34;install&#34;&gt;Install
  &lt;a href=&#34;#install&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;git clone https://github.com/gcorso/DiffDock.git
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;cd DiffDock
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;conda env create --file environment.yml
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;conda activate diffdock
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;You can check out the 
&lt;a href=&#34;https://github.com/gcorso/DiffDock&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub repo&lt;/a&gt; for details. From my experience, the above worked on Ubuntu but failed on my Mac. I actually had to use Claude Code to help me install it. Could be just a me thing.&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s take a look at this protein with 
&lt;a href=&#34;https://www.cgl.ucsf.edu/chimerax/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ChimeraX&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;tem5.gif&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Do you see the big protein with a long tail? The long tail has a low prediction score and is likely to be disordered, so we will remove that part and keep just the globular part. We can do that in ChimeraX: select the tail and delete it, then save the result as a new pdb file.&lt;/p&gt;
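&lt;p&gt;To get a rough idea of where the low-confidence tail starts before selecting it by eye, the residues with low scores can be listed from the file itself. This is only a minimal sketch, assuming the model is in fixed-column PDB format and that the per-residue pLDDT is stored in the B-factor column (columns 61 to 66). The function name &lt;code&gt;low_plddt_residues&lt;/code&gt; and the default cutoff of 50 are illustrative choices, not part of AlphaFold or ChimeraX.&lt;/p&gt;

```shell
# List "residue_number pLDDT" for C-alpha atoms whose pLDDT is below a cutoff.
# Assumption: fixed-column PDB with pLDDT stored in the B-factor field.
low_plddt_residues() {
  local pdb="$1" cutoff="${2:-50}"
  awk -v cutoff="$cutoff" '
    # Only ATOM records; the atom name lives in columns 13-16
    substr($0, 1, 4) == "ATOM" {
      name = substr($0, 13, 4); gsub(/ /, "", name)
      if (name != "CA") next            # one line per residue
      resnum = substr($0, 23, 4) + 0    # residue number, columns 23-26
      plddt  = substr($0, 61, 6) + 0    # B-factor field, columns 61-66
      if (cutoff > plddt) print resnum, plddt
    }' "$pdb"
}
```

&lt;p&gt;For example, &lt;code&gt;low_plddt_residues tem5.pdb 50&lt;/code&gt; would print one line per low-confidence residue, which can then be compared with what ChimeraX shows before deleting anything.&lt;/p&gt;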
&lt;p&gt;Then we will run DiffDock to predict the binding site. We will use &lt;code&gt;clavulanic acid&lt;/code&gt; as our ligand. You can download it from 
&lt;a href=&#34;https://pubchem.ncbi.nlm.nih.gov/compound/Clavulanic-acid&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;PubChem&lt;/a&gt;. Then we will run DiffDock with the following command:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;obabel alphafold0.cif -O tem5.pdb
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# make sure you&amp;#39;re in the DiffDock folder and have activated the conda env you installed it into&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;python3 inference.py &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  --protein_path tem5.pdb &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  --ligand &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;OC(=O)[C@@H]2/C=C\1OC[C@@H](O)/C1=C\N2&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;\&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  --out_dir results_tem5_step20 &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  --inference_steps &lt;span style=&#34;color:#099&#34;&gt;20&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  --samples_per_complex &lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  --batch_size &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# create ligand pdb&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;obabel rank1.sdf -O clav_acid.pdb
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;You see the ligand SMILES? 😀 We don&amp;rsquo;t need to download a ligand file from PubChem; we can just copy the SMILES string from its page. Woo hoo! Once you have run the above, you will get 10 ranked poses. In ChimeraX, open the protein pdb first and then all the ranked poses to visually assess the predicted docking sites.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;diffdocked.png&#34; alt=&#34;&#34;&gt;
Alright, it looks like there is one pocket, in the middle of the picture, that contains the top-ranked pose along with other poses.&lt;/p&gt;
&lt;p&gt;If you take a look at the files produced by DiffDock, each file name ends with the pose&amp;rsquo;s confidence score. Rank 1 has the highest confidence; higher (less negative) scores are better.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;list.png&#34; alt=&#34;image&#34; width=&#34;40%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
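&lt;p&gt;Instead of reading the scores off the file listing, they can be pulled into a small rank/score table. A minimal sketch, assuming the pose files are named like &lt;code&gt;rank1_confidence-0.52.sdf&lt;/code&gt; (the score here is a made-up example) and sit in a single folder. The function name &lt;code&gt;list_confidence&lt;/code&gt; is hypothetical, and the folder to point it at depends on how your DiffDock version lays out &lt;code&gt;--out_dir&lt;/code&gt;.&lt;/p&gt;

```shell
# Print "rank confidence" for every pose file in a DiffDock results folder.
# Assumption: file names look like rank<N>_confidence<score>.sdf.
list_confidence() {
  local dir="$1" f base
  for f in "$dir"/rank*_confidence*.sdf; do
    [ -e "$f" ] || continue          # skip if the glob matched nothing
    base="${f##*/}"                  # drop the directory part
    base="${base%.sdf}"              # drop the extension
    # left of "_confidence" is "rank<N>", right of it is the score
    printf '%s %s\n' "${base%%_confidence*}" "${base#*_confidence}"
  done | sed 's/^rank//' | sort -n   # keep only the number, sort by rank
}
```

&lt;p&gt;Running &lt;code&gt;list_confidence /path/to/results&lt;/code&gt; prints one line per pose, which is easier to paste into notes than a screenshot.&lt;/p&gt;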
&lt;p&gt;Alright, go through the 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/mdsim/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;MD simulation procedure&lt;/a&gt;, wait and assess!&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: I found that I could use 2 GTX 1080s and run one protein-ligand simulation per GPU, and also use &amp;amp;&amp;amp; to chain multiple mdrun commands so that it moves from one simulation to the next once the first is done! For example: gmx mdrun blah blah &amp;amp;&amp;amp; cd /path/to/another &amp;amp;&amp;amp; gmx mdrun blah blah. Also, make sure to use tmux when you&amp;rsquo;re running the production.&lt;/p&gt;
&lt;/blockquote&gt;
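&lt;p&gt;The chaining idea in the note can also be wrapped in a small function so that each GPU gets its own queue of folders. This is a sketch under assumptions rather than the exact commands I ran: it assumes every folder already holds an &lt;code&gt;md.tpr&lt;/code&gt;, and &lt;code&gt;run_queue&lt;/code&gt; and the &lt;code&gt;GMX&lt;/code&gt; variable are hypothetical names. It stops at the first failed run, which is the same behaviour as chaining the commands by hand.&lt;/p&gt;

```shell
# Set GMX=echo to preview the commands without running anything.
GMX="${GMX:-gmx}"

# Usage: run_queue GPU_ID DIR [DIR ...]
# Runs one production simulation per folder, in order, on the chosen GPU.
run_queue() {
  local gpu="$1" start="$PWD" dir
  shift
  for dir in "$@"; do
    cd "$dir" || return 1
    # -deffnm md assumes md.tpr exists in this folder (hypothetical layout)
    $GMX mdrun -deffnm md -gpu_id "$gpu" || { cd "$start"; return 1; }
    cd "$start" || return 1
  done
}
```

&lt;p&gt;With two GPUs, one tmux pane could run &lt;code&gt;run_queue 0 /path/to/tem5 /path/to/tem30&lt;/code&gt; while another runs &lt;code&gt;run_queue 1 /path/to/ctxm15 /path/to/kpc2&lt;/code&gt; (the paths are placeholders).&lt;/p&gt;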




&lt;h2 id=&#34;results&#34;&gt;Assess Results
  &lt;a href=&#34;#results&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; -e &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;\&amp;#34;Protein\&amp;#34; | \&amp;#34;UNL\&amp;#34;\nq&amp;#34;&lt;/span&gt; | gmx make_ndx -f md.gro -o index.ndx
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;0&amp;#34;&lt;/span&gt; | gmx trjconv -s md.tpr -f md.xtc -o md_nojump.xtc -pbc nojump
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;Protein_UNL System&amp;#34;&lt;/span&gt; | gmx trjconv -s md.tpr -f md_nojump.xtc -o md_noPBC.xtc -pbc mol -center -n index.ndx
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;4 13&amp;#39;&lt;/span&gt; | gmx rms -s md.tpr -f md_noPBC.xtc -o rmsd_ligand.xvg -tu ns -n index.ndx
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;4 4&amp;#39;&lt;/span&gt; | gmx rms -s md.tpr -f md_noPBC.xtc -o rmsd_protein.xvg -tu ns -n index.ndx
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; | gmx rmsf -s md.tpr -f md_noPBC.xtc -o rmsf_protein.xvg -res 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;13&lt;/span&gt; | gmx rmsf -s md.tpr -f md_noPBC.xtc -o rmsf_ligand.xvg 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;1&amp;#39;&lt;/span&gt; | gmx gyrate -s md.tpr -f md_noPBC.xtc -o gyrate.xvg 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;1 13&amp;#39;&lt;/span&gt; | gmx hbond -s md.tpr -f md_noPBC.xtc -num hbond_ligand_protein.xvg -tu ns 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;1 13&amp;#39;&lt;/span&gt; | gmx mindist -f md_noPBC.xtc -s md.tpr -od mindist_prot_lig.xvg -on numcont_prot_lig.xvg -d 0.4
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx distance -f md_noPBC.xtc -s md.tpr -select &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;com of group &amp;#34;Protein&amp;#34; plus com of group &amp;#34;UNL&amp;#34;&amp;#39;&lt;/span&gt; -oall distance_prot_lig.xvg
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;if&lt;/span&gt; grep -q &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;^;energygrps&amp;#34;&lt;/span&gt; md.mdp; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;then&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    sed -i &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;s/^;energygrps/energygrps/&amp;#39;&lt;/span&gt; md.mdp
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;Uncommented energygrps&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;elif&lt;/span&gt; grep -q &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;^energygrps&amp;#34;&lt;/span&gt; md.mdp; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;then&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;energygrps already active, doing nothing&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;else&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;energygrps = Protein UNL&amp;#34;&lt;/span&gt; &amp;gt;&amp;gt; md.mdp
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;Added energygrps&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;fi&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx grompp -f md.mdp -c md.gro -p topol.top -o rerun.tpr 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx mdrun -s rerun.tpr -rerun md_nojump.xtc -e rerun.edr -ntmpi &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; -ntomp &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;printf&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;Coul-SR:Protein-UNL\nLJ-SR:Protein-UNL\n0\n&amp;#34;&lt;/span&gt; | gmx energy -f rerun.edr -o interaction_energy.xvg
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;cp *.xvg /path/to/your/analysis
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;blockquote&gt;
&lt;p&gt;Note: Make sure to run the above after all of the simulations are done, otherwise the two will bottleneck each other, meaning both the analysis above and your active simulation will be slow. Well, I only have 4 cores, so it might be different for you! It might not matter at all. If my other simulation is still running but I want to check the results, I use -ntomp 2 above.&lt;/p&gt;
&lt;/blockquote&gt;
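&lt;p&gt;Before plotting anything, a quick numeric check of the &lt;code&gt;.xvg&lt;/code&gt; files can confirm that the analysis produced sensible output. A minimal sketch, assuming the time is in the first column and the value of interest in the second; &lt;code&gt;xvg_mean&lt;/code&gt; is a hypothetical helper, not a GROMACS command.&lt;/p&gt;

```shell
# Mean of the second column of a GROMACS .xvg file.
# Lines starting with # (comments) or @ (plot directives) are skipped.
xvg_mean() {
  awk '
    /^[#@]/ { next }                     # header lines carry no data
    NF >= 2 { sum += $2; n++ }           # accumulate the value column
    END { if (n > 0) printf "%.3f\n", sum / n }
  ' "$1"
}
```

&lt;p&gt;For example, &lt;code&gt;xvg_mean rmsd_ligand.xvg&lt;/code&gt; gives the average ligand RMSD over the whole trajectory, in the units written by &lt;code&gt;gmx rms&lt;/code&gt; (nm).&lt;/p&gt;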




&lt;h3 id=&#34;fx&#34;&gt;Create function to visualize all
  &lt;a href=&#34;#fx&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;details&gt;
&lt;summary&gt;R Code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidyverse)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(xvm)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(ggpubr)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot_all &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(path,int&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;prot &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_xvg&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(path,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rmsd_protein.xvg&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ligand &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_xvg&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(path,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rmsd_ligand.xvg&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;rmsf_prot &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_xvg&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(path,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rmsf_protein.xvg&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;rmsf_ligand &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_xvg&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(path,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rmsf_ligand.xvg&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gy &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_xvg&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(path,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;gyrate.xvg&amp;#34;&lt;/span&gt;)) 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(int) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;interaction &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_xvg&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(path,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;interaction_energy.xvg&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot_interaction &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_xvg&lt;/span&gt;(interaction)}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;hbond &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_xvg&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(path,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hbond_ligand_protein.xvg&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mindist &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_lines&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(path,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;mindist_prot_lig.xvg&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;dist &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_lines&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(path,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;distance_prot_lig.xvg&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot_rmsd_prot &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_xvg&lt;/span&gt;(prot)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot_rmsd_ligand &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_xvg&lt;/span&gt;(ligand)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot_rmsf_prot &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_xvg&lt;/span&gt;(rmsf_prot)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot_rmsf_ligand &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_xvg&lt;/span&gt;(rmsf_ligand)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot_gy &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_xvg&lt;/span&gt;(gy)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot_hbond &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(hbond&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;hbond_ligand_protein.xvg&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;data) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;distinct&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggplot&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;`Time &lt;/span&gt;(ns)`,y=`Hydrogen bonds`)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_line&lt;/span&gt;(color &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;red&amp;#34;&lt;/span&gt;, alpha&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0.8&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme_bw&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggtitle&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;Hydrogen bonds&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mindist_list &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mindist[&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(mindist, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;^@|^#&amp;#34;&lt;/span&gt;)] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_split&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mindist_val &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mindist_list))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mindist_list)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  time[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mindist_list[[i]][1]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mindist_val[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mindist_list[[i]][3]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mindist_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; time, mindist_val &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; mindist_val) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;(time),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         mindist_val &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;(mindist_val))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot_mindist &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mindist_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggplot&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;time,y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;mindist_val)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_point&lt;/span&gt;(color &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;red&amp;#34;&lt;/span&gt;, alpha&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0.7&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme_bw&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggtitle&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;Min Distance&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;dist_list &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; dist[&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(dist, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;^@|^#&amp;#34;&lt;/span&gt;)] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_trim&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_split&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;    &amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; dist_val &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(dist_list))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(dist_list)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  time[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; dist_list[[i]][1] 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  dist_val[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; dist_list[[i]][2]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;dist_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; time, dist_val &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; dist_val) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;(time),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         dist_val &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;(dist_val))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot_dist &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; dist_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggplot&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;time,y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;dist_val)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_point&lt;/span&gt;(color &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;red&amp;#34;&lt;/span&gt;, alpha&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0.7&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme_bw&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggtitle&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;Distance Between Protein and Ligand&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(int) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plotlist &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(plot_rmsd_prot, plot_rmsd_ligand, plot_rmsf_prot, plot_rmsf_ligand,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                 plot_gy, plot_interaction, plot_hbond, plot_mindist, plot_dist) }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;else {plotlist &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(plot_rmsd_prot, plot_rmsd_ligand, plot_rmsf_prot, plot_rmsf_ligand,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                 plot_gy, plot_hbond, plot_mindist, plot_dist)}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggarrange&lt;/span&gt;(plotlist &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; plotlist, ncol &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
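&lt;p&gt;The two loops in &lt;code&gt;plot_all()&lt;/code&gt; that pull the time and distance columns out of the .xvg lines could probably be folded into one small helper. Below is a minimal sketch in base R, assuming header lines start with @ or # and the data rows are whitespace-separated numbers. &lt;code&gt;read_xvg_cols&lt;/code&gt; is a hypothetical name and the toy input is made up, so treat it as an illustration rather than a drop-in replacement.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper: parse the numeric rows of an .xvg file into a data frame.
# Assumes header/comment lines start with @ or # and columns are whitespace-separated.
read_xvg_cols &amp;lt;- function(lines, col_names) {
  # drop header and comment lines
  data_lines &amp;lt;- lines[!grepl(&amp;#34;^[@#]&amp;#34;, lines)]
  # drop empty lines, if any
  data_lines &amp;lt;- data_lines[nzchar(trimws(data_lines))]
  # read.table splits on any run of whitespace and returns numeric columns
  df &amp;lt;- read.table(text = data_lines, header = FALSE)
  # keep only as many leading columns as we have names for
  df &amp;lt;- df[, seq_along(col_names), drop = FALSE]
  names(df) &amp;lt;- col_names
  df
}

# Toy input that mimics the layout of a distance .xvg file (made-up numbers)
toy &amp;lt;- c(&amp;#34;# comment line&amp;#34;,
         &amp;#34;@ title Distance&amp;#34;,
         &amp;#34;   0.000    1.234&amp;#34;,
         &amp;#34;  10.000    1.301&amp;#34;)
read_xvg_cols(toy, c(&amp;#34;time&amp;#34;, &amp;#34;dist_val&amp;#34;))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Because the columns come back as numeric, the separate &lt;code&gt;as.numeric()&lt;/code&gt; step would not be needed under these assumptions.&lt;/p&gt;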




&lt;h3 id=&#34;tem5&#34;&gt;TEM-5
  &lt;a href=&#34;#tem5&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_all&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tem5_result/&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-6-1.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-6-2.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-6-3.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-6-4.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-6-5.png&#34; width=&#34;1152&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Overall, clavulanic acid and TEM-5 demonstrated stable binding throughout the 60 ns simulation. The protein RMSD plateaued early, confirming a stable conformation, and the ligand RMSF also plateaued. The total radius of gyration remained largely stable, and both Coulombic and Lennard-Jones interaction energies stayed consistently negative, indicating favorable protein–ligand interactions. Hydrogen bonds were maintained at around 2–4 throughout the simulation, which is adequate for a small-molecule ligand. Both the center-of-mass and minimum distances converged to stable low values and remained consistent for the remainder of the trajectory. Collectively, these results support a successful molecular dynamics simulation with stable ligand binding.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;tem5-1.gif&#34; alt=&#34;image&#34; width=&#34;100%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;




&lt;h3 id=&#34;ctxm15&#34;&gt;CTX-M-15
  &lt;a href=&#34;#ctxm15&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_all&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ctxm15_result/&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-7-1.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-7-2.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-7-3.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-7-4.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-7-5.png&#34; width=&#34;1152&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Overall, clavulanic acid and CTX-M-15 also demonstrated stable binding throughout the 50 ns simulation. The ligand RMSD shows significant noise, likely because the initial coordinates came from the ACPYPE-generated .gro file rather than the docked ligand pose, but the protein RMSD plateaued and remained stable. The noise is not a sign of dissociation; for what a true dissociation looks like, see the 
&lt;a href=&#34;#tem30&#34;&gt;TEM-30 run&lt;/a&gt;. The total radius of gyration was similarly stable. Interaction energies and hydrogen bond counts were consistent throughout the trajectory, indicating sustained non-covalent interactions. Both the minimum distance and center-of-mass distance between the protein and ligand converged and remained low, further supporting stable binding. Also, upon visualizing the &lt;code&gt;md.xtc&lt;/code&gt; with the first frame of the pdb as the reference, it looks like the ligand may have briefly dissociated at one point and then reattached.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;ctxm15.gif&#34; alt=&#34;image&#34; width=&#34;100%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;




&lt;h3 id=&#34;tem30&#34;&gt;TEM-30
  &lt;a href=&#34;#tem30&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_all&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tem30_result/&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-8-1.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-8-2.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-8-3.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-8-4.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-8-5.png&#34; width=&#34;1152&#34; /&gt;&lt;/p&gt;
&lt;p&gt;This simulation presents a clear case of true ligand dissociation, indicating that clavulanic acid does not bind well to TEM-30. Notably, the simulation terminated early. Both the protein and ligand RMSD rose continuously without plateauing, though the protein RMSD values may be inflated due to the same coordinate assignment error noted previously. More definitively, both hydrogen bond counts and interaction energies dropped to zero, and the minimum distance as well as the center-of-mass distance between the protein and ligand diverged progressively over time. Collectively, these findings confirm that stable binding was not achieved in this simulation. Below, you can see the ligand fly away from the binding pocket.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;tem30.gif&#34; alt=&#34;image&#34; width=&#34;100%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
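&lt;p&gt;One rough way to put a number on the distances diverging is to look at the mean minimum distance over the last part of the run. The sketch below is only an illustration: &lt;code&gt;flag_dissociation&lt;/code&gt; is a hypothetical helper, the 1 nm cutoff and 20% window are arbitrary assumptions, and the two series are made-up numbers rather than values from these simulations.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper: flag a run as dissociated when the mean minimum distance
# over the last fraction of frames exceeds a cutoff (in nm).
# The 1 nm cutoff and the 20% window are arbitrary choices for illustration.
flag_dissociation &amp;lt;- function(mindist_val, cutoff_nm = 1, tail_frac = 0.2) {
  n &amp;lt;- length(mindist_val)
  # number of frames in the tail window, at least one
  n_tail &amp;lt;- max(1, floor(n * tail_frac))
  tail_mean &amp;lt;- mean(tail(mindist_val, n_tail))
  tail_mean &amp;gt; cutoff_nm
}

# Made-up series: one stays close to the protein, one drifts away
bound   &amp;lt;- c(0.21, 0.19, 0.22, 0.20, 0.18, 0.21, 0.20, 0.19, 0.22, 0.20)
drifted &amp;lt;- c(0.21, 0.25, 0.40, 0.80, 1.30, 1.90, 2.40, 2.80, 3.10, 3.50)

flag_dissociation(bound)    # FALSE
flag_dissociation(drifted)  # TRUE
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A check like this would miss a brief detach-and-reattach event, so it complements looking at the trajectory rather than replacing it.&lt;/p&gt;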




&lt;h3 id=&#34;kpc2&#34;&gt;KPC-2
  &lt;a href=&#34;#kpc2&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_all&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;kpc2_result/&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-9-1.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-9-2.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-9-3.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-9-4.png&#34; width=&#34;1152&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim2/index_files/figure-html/unnamed-chunk-9-5.png&#34; width=&#34;1152&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Finally, the KPC-2 and clavulanic acid simulation also demonstrated stable binding throughout the 50 ns trajectory. As noted previously, the ligand RMSD cannot be accurately interpreted due to the coordinate assignment error. Nevertheless, both interaction energies and hydrogen bond counts remained consistent throughout, and the minimum distance as well as the center-of-mass distance between the protein and ligand converged to lower values and remained stable. Importantly, however, stable binding in this simulation does not imply that clavulanic acid acts as an inhibitor of KPC-2. In fact, KPC-2 is known to hydrolyze clavulanic acid — a covalent enzymatic process that a non-covalent molecular dynamics simulation such as this cannot capture or reflect.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;kpc2-1.gif&#34; alt=&#34;image&#34; width=&#34;100%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;




&lt;h2 id=&#34;finalthought&#34;&gt;Final Thought
  &lt;a href=&#34;#finalthought&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Wow, having the previous workflow really helps lower the risk of error! I definitely learnt that my mistake of using the acpype coordinates caused significant noise in the ligand RMSD, and that I should really use the docked ligand position to create the complex.gro. Still so much to learn! But this is a really cool simulation where we can test an already-tested hypothesis and see it with our own eyes! I&amp;rsquo;m so grateful and fortunate to be living in a time when I can learn these things without TOO much disappointment. And I&amp;rsquo;m also so grateful to all the scientists who share these techniques for free! That is truly amazing and inspiring! ❤️&lt;/p&gt;




&lt;h2 id=&#34;opportunities&#34;&gt;Opportunities for improvement
  &lt;a href=&#34;#opportunities&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;I have been mistakenly using the acpype .gro coordinates to make complex.gro, which left the ligand RMSD way off even though everything else seemed fine. We should instead convert the docked pose to .gro using &lt;code&gt;gmx editconf -f diffdock_pose.pdb -o ligand_positioned.gro&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Learn how to tell, when there is a consistent non-covalent interaction between ligand and protein, whether the ligand inactivates the protein or the protein hydrolyzes the ligand, as KPC-2 does&lt;/li&gt;
&lt;li&gt;Learn how to perform covalent docking and covalent MD simulation&lt;/li&gt;
&lt;li&gt;Rewrite the MD sim pipeline into something easier to automate. Currently I have to copy and paste from my previous documentation. Mind you, it&amp;rsquo;s already REALLY helpful because I was able to do all of that in a matter of minutes. Let&amp;rsquo;s see how we can optimize it further and make it more ergonomic.&lt;/li&gt;
&lt;li&gt;Try ColabFold locally to produce the protein structure; might take some time&lt;/li&gt;
&lt;/ul&gt;




&lt;h2 id=&#34;lessons&#34;&gt;Lessons Learnt
  &lt;a href=&#34;#lessons&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;I did not know that CTX-M-15 can bind clavulanic acid. I ran the simulation first, saw that it bound, then checked the literature, and apparently it does! Simulation is another tool for learning something new!&lt;/li&gt;
&lt;li&gt;I changed the output interval in md.mdp to 50000 instead of 5000 to reduce the bottleneck between CPU and GPU&lt;/li&gt;
&lt;li&gt;Use &amp;amp;&amp;amp; chaining to run one production run after another&lt;/li&gt;
&lt;li&gt;Can use 2 or more GPUs independently if the motherboard allows&lt;/li&gt;
&lt;li&gt;Does GROMACS automatically pick the GPU with the better capacity and assign it as 0?&lt;/li&gt;
&lt;li&gt;Need to center the trajectory for a good visualization; otherwise the protein and ligand may appear to split apart, because a molecule that exits one edge of the simulation box re-enters from the opposite side.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you like this article:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;please feel free to send me a 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;comment or visit my other blogs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;please feel free to follow me on 
&lt;a href=&#34;https://bsky.app/profile/kenkoonwong.bsky.social&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;BlueSky&lt;/a&gt;, 
&lt;a href=&#34;https://twitter.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;twitter&lt;/a&gt;, 
&lt;a href=&#34;https://github.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub&lt;/a&gt; or 
&lt;a href=&#34;https://med-mastodon.com/@kenkoonwong&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Mastodon&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;if you would like to collaborate please feel free to 
&lt;a href=&#34;https://www.kenkoonwong.com/contact/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;contact me&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>Exploring Molecular Docking &amp; Molecular Dynamic Simulations - A Note To Myself</title>
      <link>https://www.kenkoonwong.com/blog/mdsim/</link>
      <pubDate>Sat, 14 Feb 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.kenkoonwong.com/blog/mdsim/</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;🧬 Explored molecular docking &amp;amp; MD simulations with penicillin binding to PBP2x — lots of stumbling through GROMACS but slowly piecing it together! 💊 Still much more to learn, but we&amp;rsquo;re getting there! 🔬&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2 id=&#34;motivations&#34;&gt;Motivations:
  &lt;a href=&#34;#motivations&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Since last year I&amp;rsquo;ve been fascinated by how scientists use computational methods to simulate the physical world. Things such as molecular dynamics are quite amazing! And we can learn by running through these processes ourselves. Since we know that penicillin binds to penicillin-binding protein 2x, why don&amp;rsquo;t we use that as a stepping stone and see how far we can go? Oh, have you used Claude Code or OpenAI Codex? If not, these are great tools for getting to the end result first, and then documenting and learning step by step, from 0 to n, how to reach that final result. I found that extremely helpful, because I know that if it worked once, it should work when I go through the steps again, and there is light at the end of the tunnel! Not to mention, as these tools work toward the end, they will let you know if your machine is not powerful enough to get there 🤣 Now, buckle up! This is REALLY going to be bumpy. All code is in &lt;code&gt;bash&lt;/code&gt;. We&amp;rsquo;ll set up a pipeline with either Python or R in the future for more experiments!&lt;/p&gt;
&lt;p&gt;It just so happens that today is Valentine&amp;rsquo;s Day. Happy Valentine&amp;rsquo;s Day! ❤️ Molecular docking and simulation is like a matchmaking activity 🤣 What a coincidence!&lt;/p&gt;




&lt;h2 id=&#34;disclaimer&#34;&gt;Disclaimer:
  &lt;a href=&#34;#disclaimer&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;I am not a biochemist. This blog serves as a note for my own reproducibility, for educational purposes only, and as documentation of what worked and what didn&amp;rsquo;t, so that I won&amp;rsquo;t repeat the same mistakes in the future. If you notice any errors, please educate me! Most of the code below is run in Ubuntu bash, with some R for visualization. I also used Claude a lot to help discover, debug, and learn.&lt;/em&gt;&lt;/p&gt;




&lt;h2 id=&#34;objectives&#34;&gt;Objectives:
  &lt;a href=&#34;#objectives&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#what&#34;&gt;What is Molecular Docking &amp;amp; Molecular Dynamic Simulations?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#docking&#34;&gt;Molecular Docking&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#install-vina&#34;&gt;Autodock Vina Installation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#download&#34;&gt;Download&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#convert&#34;&gt;Convert Protein and Ligand Structures&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#run-vina&#34;&gt;Run Vina&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#mdsim&#34;&gt;Molecular Dynamic Simulation&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#install&#34;&gt;Gromacs Installations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#protein&#34;&gt;Prepare Protein&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#ligand&#34;&gt;Prepare Ligand&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#restraint&#34;&gt;Create Position Restraints For The Ligand&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#edit&#34;&gt;Edit topol.top&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#complex&#34;&gt;Create The Complex&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#box&#34;&gt;Define Simulation Box&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#solvent&#34;&gt;Create Solvent&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#ion&#34;&gt;Add Ion&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#em&#34;&gt;Energy Minimization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#nvt&#34;&gt;NVT Equilibration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#npt&#34;&gt;NPT Equilibration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#prod&#34;&gt;Production&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#assess&#34;&gt;Assessment&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#rmsd&#34;&gt;RMSD&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#rmsf&#34;&gt;RMSF&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#gyrate&#34;&gt;Radius of Gyration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#interaction&#34;&gt;Interaction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#hbond&#34;&gt;H bond&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#distance&#34;&gt;Binding Distance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#viz&#34;&gt;Visual Inspection&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#opportunities&#34;&gt;Opportunities for Improvement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#lessons&#34;&gt;Lessons learnt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Please take a look at these 
&lt;a href=&#34;http://www.mdtutorials.com/gmx/index.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Gromacs Tutorials&lt;/a&gt;, which have a lot of fantastic code to help beginners like me!&lt;/p&gt;




&lt;h2 id=&#34;what&#34;&gt;What Is Molecular Docking &amp;amp; Molecular Dynamic Simulations?
  &lt;a href=&#34;#what&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Molecular docking is basically computational lock and key exploration - you&amp;rsquo;re predicting how a small molecule (like a medicine) will bind to a protein target by testing different orientations and conformations to find the lowest energy binding pose. It&amp;rsquo;s a static snapshot: protein and ligand are treated as relatively rigid (or semi-flexible), and you get binding scores that estimate affinity. Molecular dynamics (MD) simulation takes it way further - it&amp;rsquo;s like watching a molecular movie where you simulate the actual physics of atoms moving over time (picoseconds to microseconds), accounting for solvent, temperature, and all the wiggling and conformational changes that happen in real biological systems. Docking gives you the &amp;ldquo;where and how tight,&amp;rdquo; while MD gives you the &amp;ldquo;what happens next&amp;rdquo; - stability, induced fit, binding pathways, and whether that docked pose actually holds up when everything&amp;rsquo;s allowed to move realistically. You often use docking first to generate binding hypotheses, then validate or refine them with MD.&lt;/p&gt;




&lt;h2 id=&#34;docking&#34;&gt;Molecular Docking
  &lt;a href=&#34;#docking&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;




&lt;h3 id=&#34;install-vina&#34;&gt;Installation
  &lt;a href=&#34;#install-vina&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sudo apt update
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sudo apt install autodock-vina &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# I have version `1.2.5`. &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sudo apt install openbabel &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# I have version 3.1.0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3 id=&#34;download&#34;&gt;Download
  &lt;a href=&#34;#download&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Now, we have to download ligand and protein structures. Let&amp;rsquo;s go to 
&lt;a href=&#34;https://www.rcsb.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;RCSB&lt;/a&gt; and download our protein.&lt;/p&gt;
&lt;p&gt;We&amp;rsquo;ll be working with 
&lt;a href=&#34;https://www.rcsb.org/structure/5OJ0&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Penicillin-binding Protein 2x&lt;/a&gt;.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;https://cdn.rcsb.org/images/structures/5oj0_assembly-1.jpeg&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;This one is from Streptococcus pneumoniae with cefepime already bound to it. So at least we know where the binding site is, and we can use its coordinates to decide where to dock our penicillin.&lt;/p&gt;
&lt;p&gt;Download the 
&lt;a href=&#34;https://files.rcsb.org/download/5OJ0.pdb&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;pdb&lt;/a&gt;. If we use 
&lt;a href=&#34;&#34;&gt;ChimeraX&lt;/a&gt; to visualize the pdb, it will look something like this, with the ligand (cefepime) highlighted in green.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;pbp2x.gif&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;If we simply open the pdb file we just downloaded and scroll through it, we will see &lt;code&gt;ATOM&lt;/code&gt; records, which are standard protein/nucleic acid atoms, and &lt;code&gt;HETATM&lt;/code&gt; records, which are heteroatoms such as ligands, waters, or ions.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;pdb.png&#34; alt=&#34;image&#34; width=&#34;100%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;In the middle, where we see lots of columns of numbers, are the &lt;code&gt;xyz&lt;/code&gt; coordinates of where the ligand (cefepime, aka 9WT) is located on this protein. The coordinates are around &lt;code&gt;x=30&lt;/code&gt;, &lt;code&gt;y=-15&lt;/code&gt;, &lt;code&gt;z=50&lt;/code&gt;. Remember these, because we will use them to set up our docking later on.&lt;/p&gt;
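&lt;p&gt;Instead of eyeballing the columns, the center of the bound ligand can be averaged straight from the file. This is only a minimal sketch: &lt;code&gt;ligand_center&lt;/code&gt; is a hypothetical helper name, and it assumes the standard fixed-column PDB layout (residue name in columns 18-20, x/y/z in columns 31-54) and that &lt;code&gt;5OJ0.pdb&lt;/code&gt; sits in the current directory.&lt;/p&gt;

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): average the x/y/z columns of every HETATM
# record that belongs to one residue name.
# PDB is a fixed-column format: residue name in columns 18-20,
# x in 31-38, y in 39-46, z in 47-54.
ligand_center() {
  local pdb="$1" resname="$2"
  awk -v res="$resname" '
    /^HETATM/ {
      if (substr($0, 18, 3) == res) {
        x += substr($0, 31, 8); y += substr($0, 39, 8); z += substr($0, 47, 8); n++
      }
    }
    END {
      if (n > 0) printf "center_x=%.1f center_y=%.1f center_z=%.1f\n", x/n, y/n, z/n
      else       print "no HETATM records found for " res
    }' "$pdb"
}

# Usage: only runs if the downloaded structure is present
if [ -f 5OJ0.pdb ]; then
  ligand_center 5OJ0.pdb 9WT
fi
```

&lt;p&gt;If the structure contains more than one copy of the ligand, the average would land between them, so it is worth checking the result against the viewer before using it as a docking center.&lt;/p&gt;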
&lt;p&gt;Next we&amp;rsquo;ll download 
&lt;a href=&#34;https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/CID/5904/record/SDF?record_type=3d&amp;amp;response_type=save&amp;amp;response_basename=Conformer3D_COMPOUND_CID_5904&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;penicillin G&lt;/a&gt;, this time from 
&lt;a href=&#34;https://pubchem.ncbi.nlm.nih.gov/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;PubChem&lt;/a&gt; instead.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;pcn.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;




&lt;h3 id=&#34;convert&#34;&gt;Convert Protein and Ligand Structures
  &lt;a href=&#34;#convert&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;This is not always the case, but because our protein already contains a ligand, let&amp;rsquo;s remove that ligand and create a clean &lt;code&gt;pbp2x_protein.pdb&lt;/code&gt;, then convert it to &lt;code&gt;pdbqt&lt;/code&gt;, which is the format that vina uses.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# remove ligand from protein pdb&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;grep &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;^ATOM&amp;#34;&lt;/span&gt; 5OJ0.pdb &amp;gt; pbp2x_protein.pdb
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# convert pdb and sdf files to pdbqt for autodock&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;obabel pbp2x_protein.pdb -O pbp2x_protein.pdbqt -xr &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#(-xr writes the receptor as a rigid molecule, with no torsion tree)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;obabel Conformer3D_COMPOUND_CID_5904.sdf -O pcn.pdbqt -xh &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#(-xh preserves the hydrogens already in the sdf, needed for proper docking)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
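&lt;p&gt;A quick sanity check that the &lt;code&gt;grep&lt;/code&gt; step really dropped the ligand and waters is to count the record types before and after. A minimal sketch, where &lt;code&gt;count_records&lt;/code&gt; is a hypothetical helper name and the two file names are the ones used above:&lt;/p&gt;

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helper): count ATOM and HETATM records in a PDB file.
count_records() {
  awk '
    /^ATOM/   { atom++ }
    /^HETATM/ { het++ }
    END { printf "ATOM=%d HETATM=%d\n", atom, het }' "$1"
}

# Compare the original download with the cleaned protein-only file
for f in 5OJ0.pdb pbp2x_protein.pdb; do
  if [ -f "$f" ]; then
    printf '%s: ' "$f"
    count_records "$f"
  fi
done
```

&lt;p&gt;The cleaned file should report the same &lt;code&gt;ATOM&lt;/code&gt; count as the original and &lt;code&gt;HETATM=0&lt;/code&gt;.&lt;/p&gt;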



&lt;h3 id=&#34;run-vina&#34;&gt;Run Vina
  &lt;a href=&#34;#run-vina&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;vina --receptor pbp2x_protein.pdbqt --ligand pcn.pdbqt &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;     --center_x 30.0 --center_y -15.0 --center_z 50.0 &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;     --size_x 30.0 --size_y 30.0 --size_z 30.0 &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;     --exhaustiveness &lt;span style=&#34;color:#099&#34;&gt;16&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;     --num_modes &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;     --out pcn_docked.pdbqt 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## turn pdbqt back to pdb&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;obabel pcn_docked.pdbqt -O pcn_docked.pdb
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;code&gt;--num_modes 1&lt;/code&gt; keeps only the top predicted pose, which is written to &lt;code&gt;pcn_docked.pdbqt&lt;/code&gt;. We then convert this back to pdb for visualization and for later use in the molecular dynamics simulation.&lt;/p&gt;
&lt;p&gt;After it&amp;rsquo;s done, it will look something like the screenshot below. For more details, 
&lt;a href=&#34;https://autodock-vina.readthedocs.io/en/latest/faq.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;please see the AutoDock Vina FAQ&lt;/a&gt;. It answers a lot of questions, one of which is: what is a good search size?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;You should probably avoid search spaces bigger than 30 x 30 x 30 Angstrom, unless you also increase “--exhaustiveness”.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;pcn_docked.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;The predicted affinity is &lt;code&gt;-6.83 kcal/mol&lt;/code&gt;. Is this considered OK? From my shallow reading, the more negative the better. Generally, &amp;lt; -7 is considered good (some say -10, arbitrarily). If it&amp;rsquo;s &amp;gt; -5, it is less likely to bind. So this is borderline, but we can still use it for molecular dynamics simulations to see if it holds up. But before that, let&amp;rsquo;s try an experiment: what if we dock a compound that we know doesn&amp;rsquo;t bind to this protein, say aspirin?&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;vina_aspirin.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;Alright! Pretty good! &lt;code&gt;-4.968&lt;/code&gt; kcal/mol is the predicted binding affinity of aspirin to our protein target. Much weaker than our penicillin! Ideally, in virtual screening, we would run a large set of ligands through this and pick out the top hits.&lt;/p&gt;
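&lt;p&gt;When there are more than a couple of ligands, reading scores off screenshots gets tedious. The sketch below pulls the top-pose affinity out of a Vina output file and labels it with the rough cutoffs mentioned above (-7 and -5 kcal/mol). Both function names are hypothetical, the cutoffs are only the rules of thumb from my reading, and it assumes the output file carries the usual &lt;code&gt;REMARK VINA RESULT:&lt;/code&gt; line for each pose.&lt;/p&gt;

```bash
#!/usr/bin/env bash
# Sketch (hypothetical helpers): read the top-pose affinity from a Vina output
# file and label it with rough rule-of-thumb cutoffs (-7 and -5 kcal/mol).
# Vina writes one "REMARK VINA RESULT:" line per pose; the first number on that
# line is the affinity, and the first such line is the best pose.
vina_best_affinity() {
  awk '/^REMARK VINA RESULT/ { print $4; exit }' "$1"
}

label_affinity() {
  awk -v s="$1" 'BEGIN {
    s = s + 0                        # force a numeric comparison
    if (-7 > s)      print "good"
    else if (s > -5) print "unlikely"
    else             print "borderline"
  }'
}

# Usage: only runs if the docked pose from the Vina step is present
if [ -f pcn_docked.pdbqt ]; then
  score=$(vina_best_affinity pcn_docked.pdbqt)
  echo "$score kcal/mol: $(label_affinity "$score")"
fi
```

&lt;p&gt;Looping this over a folder of docked outputs and sorting by score would be one simple way to rank a small screening set.&lt;/p&gt;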




&lt;h2 id=&#34;mdsim&#34;&gt;Molecular Dynamic Simulation
  &lt;a href=&#34;#mdsim&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;




&lt;h3 id=&#34;install&#34;&gt;Install Gromacs
  &lt;a href=&#34;#install&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Install dependencies&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sudo apt update
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sudo apt install -y build-essential cmake git libfftw3-dev libopenmpi-dev wget
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Install NVIDIA drivers and CUDA (if not already installed)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sudo apt install -y nvidia-driver-535 nvidia-cuda-toolkit
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Download GROMACS&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;cd&lt;/span&gt; ~
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;wget https://ftp.gromacs.org/gromacs/gromacs-2024.4.tar.gz
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;tar xfz gromacs-2024.4.tar.gz
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;cd&lt;/span&gt; gromacs-2024.4
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Build with GPU support&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mkdir build
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;cd&lt;/span&gt; build
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;cmake .. -DGMX_BUILD_OWN_FFTW&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;ON -DGMX_GPU&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;CUDA -DCUDA_TOOLKIT_ROOT_DIR&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;/usr/lib/cuda -DREGRESSIONTEST_DOWNLOAD&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;ON
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;make -j&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$(&lt;/span&gt;nproc&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sudo make install
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Add GROMACS to PATH permanently&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;source /usr/local/gromacs/bin/GMXRC&amp;#39;&lt;/span&gt; &amp;gt;&amp;gt; ~/.bashrc
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;source&lt;/span&gt; ~/.bashrc
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Verify installation&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx --version
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
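&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Check that the build has GPU support and that the driver sees the GPU&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx --version | grep -i &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;GPU support&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;nvidia-smi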
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3 id=&#34;protein&#34;&gt;Prepare Protein For Gromacs
  &lt;a href=&#34;#protein&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx pdb2gmx -f pbp2x_protein.pdb -o pbp2x_processed.gro -water tip3p -ff amber99sb-ildn
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This converts your protein PDB file into a simulation-ready format by adding hydrogen atoms, applying the AMBER99SB-ILDN force field parameters, and selecting the TIP3P water model. It outputs a processed structure file &lt;code&gt;pbp2x_processed.gro&lt;/code&gt; and a topology file &lt;code&gt;topol.top&lt;/code&gt; that contains all the molecular interaction parameters needed for running molecular dynamics simulations.&lt;/p&gt;




&lt;h3 id=&#34;ligand&#34;&gt;Prepare Ligand
  &lt;a href=&#34;#ligand&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;acpype -i pcn_docked.mol2 -n &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt; -a gaff2 -c bcc
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This will create a directory called &lt;code&gt;pcn_docked.acpype&lt;/code&gt;. &lt;code&gt;acpype&lt;/code&gt; takes your ligand structure (in this case &lt;code&gt;pcn_docked.mol2&lt;/code&gt;, the docked pose in mol2 format) and generates the necessary topology and coordinate files for GROMACS. It uses the Antechamber tool to assign GAFF2 atom types (&lt;code&gt;-a gaff2&lt;/code&gt;) and AM1-BCC partial charges (&lt;code&gt;-c bcc&lt;/code&gt;) with a net charge of 0 (&lt;code&gt;-n 0&lt;/code&gt;), and outputs files like &lt;code&gt;pcn_docked_GMX.itp&lt;/code&gt; (topology) and &lt;code&gt;pcn_docked_GMX.gro&lt;/code&gt; (coordinates) that can be directly included in your GROMACS simulations.&lt;/p&gt;




&lt;h3 id=&#34;restraint&#34;&gt;Create Position Restraints For The Ligand
  &lt;a href=&#34;#restraint&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt; | gmx genrestr -f pcn_docked.acpype/pcn_docked_GMX.gro &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;             -o posre_ligand.itp &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;             -fc &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;             
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## echo 0 means select &amp;#34;system&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;blockquote&gt;
&lt;p&gt;Why do we need to do this? Because we want to make sure that our ligand doesn&amp;rsquo;t fly away during the equilibration phase. By creating a position restraint file for the ligand, we&amp;rsquo;re essentially telling GROMACS to apply a strong harmonic restraint (a force constant of 1000 kJ/mol/nm² in each direction) to keep the ligand in place while the rest of the system (protein, water, ions) relaxes around it. This is especially important if we want to see how well our docked pose holds up under more realistic conditions without it drifting away from the binding site. I will use this ⛓️‍💥 to remind us where it is applied.&lt;/p&gt;
&lt;/blockquote&gt;
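&lt;p&gt;To get a feel for what the &lt;code&gt;1000&lt;/code&gt; passed to &lt;code&gt;-fc&lt;/code&gt; means, here is a minimal sketch of the harmonic penalty a restrained atom pays for moving away from its reference position. The function name and the example displacement are made up for illustration; the only thing taken from the command above is the force constant, which GROMACS reads in kJ/mol/nm².&lt;/p&gt;

```python
def restraint_energy(displacement_nm, fc=(1000.0, 1000.0, 1000.0)):
    """Harmonic position-restraint energy in kJ/mol.

    displacement_nm: (dx, dy, dz) from the reference position, in nm.
    fc: force constants per axis in kJ/mol/nm^2 (same values as -fc above).
    """
    return sum(0.5 * k * d * d for k, d in zip(fc, displacement_nm))


# Hypothetical example: an atom pushed 0.1 nm (1 Angstrom) along x
print(restraint_energy((0.1, 0.0, 0.0)))
```

&lt;p&gt;A 0.1 nm shift costs about 5 kJ/mol per atom, roughly twice the thermal energy at 300 K, so the restraint is firm without being a hard wall.&lt;/p&gt;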




&lt;h3 id=&#34;edit&#34;&gt;Edit topol.top
  &lt;a href=&#34;#edit&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;By now you should have a &lt;code&gt;topol.top&lt;/code&gt; file. Open it and scroll down to:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;; Include Position restraint file
#ifdef POSRES
#include &amp;#34;posre.itp&amp;#34;
#endif
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;then add this chunk after it:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;; Include ligand position restraint file
#ifdef POSRES_LIGAND
#include &amp;#34;posre_ligand.itp&amp;#34;    ; remember this ⛓️‍💥 ?
#endif
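&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Do you remember the &lt;code&gt;posre_ligand.itp&lt;/code&gt; that we created 
&lt;a href=&#34;#restraint&#34;&gt;here&lt;/a&gt;? We include it in &lt;code&gt;topol.top&lt;/code&gt; so that during the &lt;code&gt;NVT&lt;/code&gt; and &lt;code&gt;NPT&lt;/code&gt; phases GROMACS reads this file and applies the restraint. One caveat: GROMACS attaches position restraints to the most recently defined &lt;code&gt;[ moleculetype ]&lt;/code&gt;, so to act on the ligand rather than the protein this chunk has to come directly after the ligand topology &lt;code&gt;#include&lt;/code&gt;, wherever that ends up in your file. We&amp;rsquo;re not done yet! More editing! Scary, right? I know.&lt;/p&gt;
&lt;p&gt;After this, make sure your &lt;code&gt;topol.top&lt;/code&gt; file has the following as well.&lt;/p&gt;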
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;#include &amp;#34;amber99sb-ildn.ff/forcefield.itp&amp;#34;
#include &amp;#34;pcn_docked.acpype/pcn_docked_GMX.itp&amp;#34;  ; Include ligand topology

[ system ]
Protein-Ligand Complex

[ molecules ]
Protein_chain_A    1
pcn_docked                1  ; Your ligand name from pcn_docked_GMX.itp
&lt;/code&gt;&lt;/pre&gt;&lt;blockquote&gt;
&lt;p&gt;Take note that the ligand name under &lt;code&gt;[ molecules ]&lt;/code&gt; has to match the molecule name defined in the &lt;code&gt;.itp&lt;/code&gt; file generated by &lt;code&gt;acpype&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
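&lt;p&gt;If you are not sure which name to put under &lt;code&gt;[ molecules ]&lt;/code&gt;, a small sketch like this can pull it out of the ligand &lt;code&gt;.itp&lt;/code&gt;. It assumes the usual layout where the first data line under &lt;code&gt;[ moleculetype ]&lt;/code&gt; starts with the molecule name; the function name and the example text are hypothetical, so check it against your own file.&lt;/p&gt;

```python
def moleculetype_name(itp_text):
    """Return the first token on the first data line under [ moleculetype ]."""
    in_section = False
    for raw in itp_text.splitlines():
        line = raw.split(";", 1)[0].strip()   # drop ';' comments
        if not line:
            continue
        if line.startswith("["):
            in_section = line.strip("[] ").lower() == "moleculetype"
            continue
        if in_section:
            return line.split()[0]
    return None


# Hypothetical excerpt of an acpype-style ligand topology
example_itp = """\
[ moleculetype ]
;name            nrexcl
 pcn_docked       3

[ atoms ]
"""
print(moleculetype_name(example_itp))
```

&lt;p&gt;To use it on a real file, read the text first, for example &lt;code&gt;moleculetype_name(open(&#34;pcn_docked.acpype/pcn_docked_GMX.itp&#34;).read())&lt;/code&gt;.&lt;/p&gt;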




&lt;h3 id=&#34;complex&#34;&gt;Create The Complex
  &lt;a href=&#34;#complex&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;I&amp;rsquo;ve written myself a little cheat to do this by code&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;python3 &lt;span style=&#34;color:#d14&#34;&gt;&amp;lt;&amp;lt; &amp;#39;PYEOF&amp;#39;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;# Combine protein and ligand GRO files into complex.gro
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;protein = &amp;#34;/your/path/pbp_test/pbp2x_processed.gro&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;# NOTE: the ligand coordinates must be the docked pose (see the warning further down)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;ligand = &amp;#34;/your/path/pbp_test/pcn_docked.acpype/pcn_docked_GMX.gro&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;output = &amp;#34;/your/path/pbp_test/complex.gro&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;with open(protein) as f:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    prot_lines = f.readlines()
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;with open(ligand) as f:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    lig_lines = f.readlines()
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;# Parse protein: title, atom count, atom lines, box vectors
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;prot_title = prot_lines[0].strip()
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;prot_natoms = int(prot_lines[1].strip())
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;prot_atoms = prot_lines[2:2+prot_natoms]
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;prot_box = prot_lines[2+prot_natoms].strip()
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;# Parse ligand
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;lig_natoms = int(lig_lines[1].strip())
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;lig_atoms = lig_lines[2:2+lig_natoms]
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;total_atoms = prot_natoms + lig_natoms
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;# GRO format: residue number (5 chars), residue name (5 chars),
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;# atom name (5 chars), atom number (5 chars), x y z (8.3f each)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;# Atom numbers wrap at 99999
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;with open(output, &amp;#39;w&amp;#39;) as f:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    f.write(f&amp;#34;{prot_title} with ligand\n&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    f.write(f&amp;#34;{total_atoms}\n&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    # Write protein atoms as-is
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    for line in prot_atoms:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        f.write(line)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    # Write ligand atoms, renumbering atoms continuing from protein
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    # Get the last residue number from protein
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    last_prot_resnum = int(prot_atoms[-1][:5])
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    new_resnum = last_prot_resnum + 1
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    for i, line in enumerate(lig_atoms):
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        # Renumber residue and atom
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        atom_num = (prot_natoms + i + 1) % 100000
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        # Format: %5d%-5s%5s%5d%8.3f%8.3f%8.3f
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        resname = line[5:10]  # residue name
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        atomname = line[10:15]  # atom name
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        x = line[20:28]
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        y = line[28:36]
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        z = line[36:44]
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        vel = line[44:] if len(line) &amp;gt; 44 else &amp;#34;&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        new_line = f&amp;#34;{new_resnum:5d}{resname}{atomname}{atom_num:5d}{x}{y}{z}&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        if vel.strip():
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;            new_line += vel
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        else:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;            new_line += &amp;#34;\n&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;        f.write(new_line)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    # Write box vectors from protein
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    f.write(f&amp;#34;{prot_box}\n&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;print(f&amp;#34;Combined {prot_natoms} protein + {lig_natoms} ligand = {total_atoms} total atoms&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;print(&amp;#34;Written to complex.gro&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;# Verify
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;with open(output) as f:
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;    lines = f.readlines()
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;print(f&amp;#34;Output file: {len(lines)} lines&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;print(f&amp;#34;Header: {lines[0].strip()}&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;print(f&amp;#34;Atom count: {lines[1].strip()}&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;print(f&amp;#34;Last atom line: {lines[-2].strip()}&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;print(f&amp;#34;Box: {lines[-1].strip()}&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;PYEOF&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: The above code may not work as-is for your files. You might have to rewrite it to make it more robust.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;We essentially want to combine our &lt;code&gt;pbp2x_processed.gro&lt;/code&gt; (
&lt;a href=&#34;#prepare&#34;&gt;remember this?&lt;/a&gt;) with our &lt;code&gt;pcn_docked.gro&lt;/code&gt; into a single protein-ligand complex and call it &lt;code&gt;complex.gro&lt;/code&gt;. If you would rather do it by hand, the steps are below.&lt;/p&gt;




&lt;h4 id=&#34;convert-oringally-docked-pose-to-gro&#34;&gt;Convert Originally Docked Pose To gro
  &lt;a href=&#34;#convert-oringally-docked-pose-to-gro&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx editconf -f pcn_docked.pdb -o pcn_docked.gro
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Open the processed protein file, which looks like this:&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;pbp2x_processed_gro.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;Then copy the ligand atom lines from this:&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;pcn_docked_gro.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;Change the total atom count on the second line to &lt;code&gt;10082&lt;/code&gt; (10041 protein atoms + 41 ligand atoms = 10082). Then insert the &lt;code&gt;pcn_docked.gro&lt;/code&gt; coordinates right after the protein&amp;rsquo;s coordinates like so.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;complex.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Warning: do not use the &lt;code&gt;.gro&lt;/code&gt; file from acpype here, because its coordinates may not match the docked pose. Convert your original docked pose PDB instead, with something like &lt;code&gt;gmx editconf -f diffdock_pose.pdb -o ligand_positioned.gro&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
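&lt;p&gt;Since the manual route relies on editing the atom count by hand, a quick sanity check is worth having. The sketch below compares the count declared on the second line of a &lt;code&gt;.gro&lt;/code&gt; file with the number of atom lines actually present. The function name and the three-atom sample are made up for illustration.&lt;/p&gt;

```python
def check_gro_atom_count(gro_text):
    """Return (declared, found) atom counts for the text of a .gro file."""
    lines = [ln for ln in gro_text.splitlines() if ln.strip()]
    declared = int(lines[1])      # second line holds the atom count
    found = len(lines) - 3        # minus title, count line, and box line
    return declared, found


# Made-up three-atom example in .gro layout
sample = "\n".join([
    "Tiny example",
    "    3",
    "    1ALA      N    1   0.000   0.000   0.000",
    "    1ALA     CA    2   0.100   0.000   0.000",
    "    2PCN     C1    3   0.200   0.100   0.000",
    "   5.00000   5.00000   5.00000",
])

declared, found = check_gro_atom_count(sample)
print(f"declared={declared} found={found} match={declared == found}")
```

&lt;p&gt;On the real file you would pass in &lt;code&gt;open(&#34;complex.gro&#34;).read()&lt;/code&gt; and expect both numbers to come back as 10082.&lt;/p&gt;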




&lt;h3 id=&#34;box&#34;&gt;Define Simulation Box
  &lt;a href=&#34;#box&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx editconf -f complex.gro -o complex_box.gro -c -d 1.0 -bt dodecahedron
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Before we can do anything useful, we need to give our protein-ligand complex a &amp;ldquo;home&amp;rdquo; — and that&amp;rsquo;s exactly what this step does. We&amp;rsquo;re using &lt;code&gt;gmx editconf&lt;/code&gt; to center the complex and build a box around it, leaving at least 1.0 nm of breathing room between the protein and the box walls so things don&amp;rsquo;t awkwardly interact with themselves across periodic boundaries. We&amp;rsquo;re going with a dodecahedron shape instead of a plain cube because it&amp;rsquo;s more sphere-like, which means we need to fill it with about 30% less water — and less water means faster simulations. I think&amp;hellip;&lt;/p&gt;
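&lt;p&gt;The &amp;ldquo;about 30% less water&amp;rdquo; figure can be checked with a little geometry: for the same distance &lt;code&gt;d&lt;/code&gt; between periodic images, a cube has volume d³ while a rhombic dodecahedron has volume (√2/2)·d³. The sketch below works that out; the 6.0 nm solute diameter is a made-up number, and the function name is hypothetical.&lt;/p&gt;

```python
import math


def box_volumes(solute_diameter_nm, padding_nm=1.0):
    """Return (cubic, rhombic dodecahedron) box volumes in nm^3."""
    # Distance between periodic images: solute plus padding on both sides
    d = solute_diameter_nm + 2.0 * padding_nm
    cubic = d ** 3
    dodecahedron = (math.sqrt(2.0) / 2.0) * d ** 3
    return cubic, dodecahedron


cubic, dodeca = box_volumes(6.0)   # 6.0 nm is an illustrative solute size
saving = 1.0 - dodeca / cubic
print(f"cubic: {cubic:.0f} nm^3, dodecahedron: {dodeca:.0f} nm^3, saving: {saving:.1%}")
```

&lt;p&gt;The saving is the same fraction whatever the solute size, a bit over 29%, which is where the rule of thumb comes from.&lt;/p&gt;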




&lt;h3 id=&#34;solvate&#34;&gt;Solvate The System
  &lt;a href=&#34;#solvate&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx solvate -cp complex_box.gro -cs spc216.gro -o complex_solv.gro -p topol.top
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;A protein floating in an empty box isn&amp;rsquo;t very biologically realistic, so let&amp;rsquo;s add some water. &lt;code&gt;gmx solvate&lt;/code&gt; takes a pre-equilibrated box of three-point water (spc216.gro, a generic configuration that works for SPC, SPC/E, and TIP3P) and floods our box with water molecules around the complex. It also automatically updates &lt;code&gt;topol.top&lt;/code&gt; to keep track of how many water molecules were added.&lt;/p&gt;




&lt;h3 id=&#34;ion&#34;&gt;Add Ions
  &lt;a href=&#34;#ion&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Make sure to create &lt;code&gt;ion.mdp&lt;/code&gt; beforehand with the following parameters.&lt;/p&gt;




&lt;h4 id=&#34;ionmdp&#34;&gt;ion.mdp
  &lt;a href=&#34;#ionmdp&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;; ion.mdp - Parameters for adding ions

; Run control
integrator = steep        ; Steepest descent minimization
nsteps     = 50000        ; Maximum number of steps
emtol      = 1000.0       ; Convergence when max force &amp;lt; 1000 kJ/mol/nm
emstep     = 0.01         ; Initial step size (nm)

; Output control
nstlog     = 500          ; Frequency to write to log file
nstenergy  = 500          ; Frequency to write energies

; Neighbor searching
cutoff-scheme = Verlet
ns-type       = grid
nstlist       = 10
rlist         = 1.0       ; Short-range cutoff (nm)

; Electrostatics
coulombtype   = PME       ; Particle Mesh Ewald
rcoulomb      = 1.0       ; Coulomb cutoff (nm)

; Van der Waals
vdwtype       = Cut-off
rvdw          = 1.0       ; VdW cutoff (nm)

; Periodic boundary conditions
pbc           = xyz       ; 3D periodic boundaries

; Temperature and pressure (not really used, but good practice)
tcoupl        = no
pcoupl        = no
gen-vel       = no
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Then add the ions:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx grompp -f ion.mdp -c complex_solv.gro -p topol.top -o ion.tpr -maxwarn &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;15&lt;/span&gt; | gmx genion -s ion.tpr -o complex_solv_ions.gro -p topol.top -pname NA -nname  CL -neutral 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## 15 was the SOL group here; the number can differ per system, so check the group list genion prints&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Our solvated system is probably carrying a net charge at this point, and running a simulation like that is a recipe for trouble. To fix this, we first prep a run input file with &lt;code&gt;gmx grompp&lt;/code&gt;, then use &lt;code&gt;gmx genion&lt;/code&gt; to swap out some water molecules for sodium (NA) and chloride (CL) ions. The &lt;code&gt;-neutral&lt;/code&gt; flag handles the math for you and adds just enough ions to zero out the total charge.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: I learnt that &lt;code&gt;-maxwarn&lt;/code&gt; can be a problem. If you use it too routinely, eventually something will break during EM, NVT, NPT, or production. When there is a warning, I try to debug it first to make sure it doesn&amp;rsquo;t cause a problem later on.&lt;/p&gt;
&lt;/blockquote&gt;
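&lt;p&gt;The &amp;ldquo;math&amp;rdquo; that &lt;code&gt;-neutral&lt;/code&gt; handles is simple enough to sketch: a negative net charge needs that many sodium ions, a positive one needs that many chloride ions. This is only an illustration of the idea with a hypothetical function name; &lt;code&gt;gmx genion&lt;/code&gt; works out the charge from the &lt;code&gt;.tpr&lt;/code&gt; file itself.&lt;/p&gt;

```python
def counter_ions(net_charge):
    """Return (n_na, n_cl) monovalent ions needed to neutralise net_charge."""
    q = round(net_charge)   # topology charges carry small rounding noise
    n_na = max(0, -q)       # negative system: add Na+
    n_cl = max(0, q)        # positive system: add Cl-
    return n_na, n_cl


# Hypothetical net charges, just to show both directions
for charge in (-3.0, 2.0000001, 0.0):
    print(charge, counter_ions(charge))
```

&lt;p&gt;If the net charge is far from an integer, that usually points to a problem in the ligand or protein topology rather than something ions can fix.&lt;/p&gt;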




&lt;h3 id=&#34;em&#34;&gt;Energy Minimization
  &lt;a href=&#34;#em&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Make sure to create &lt;code&gt;em.mdp&lt;/code&gt; beforehand with the following parameters.&lt;/p&gt;




&lt;h4 id=&#34;emmdp&#34;&gt;em.mdp
  &lt;a href=&#34;#emmdp&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;; Energy minimization parameters
integrator  = steep         ; Steepest descent minimization
emtol       = 1000.0        ; Stop when max force &amp;lt; 1000.0 kJ/mol/nm
emstep      = 0.01          ; Initial step size
nsteps      = 50000         ; Maximum number of steps

; Output control
nstlog      = 100           ; Write to log file every 100 steps
nstenergy   = 100           ; Write energies every 100 steps

; Neighbor searching
cutoff-scheme = Verlet
nstlist     = 10
ns_type     = grid
pbc         = xyz           ; Periodic boundary conditions

; Electrostatics
coulombtype = PME
rcoulomb    = 1.0

; Van der Waals
vdwtype     = Cut-off
rvdw        = 1.0

; Temperature and pressure coupling (off for EM)
tcoupl      = no
pcoupl      = no
&lt;/code&gt;&lt;/pre&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx grompp -f em.mdp -c complex_solv_ions.gro -p topol.top -o em.tpr
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx mdrun -v -deffnm em
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;At this point the system is solvated and neutralized, but it&amp;rsquo;s probably a bit of a mess geometrically — atoms may be too close together, bonds at weird angles, that sort of thing. Energy minimization is basically the &amp;ldquo;calm down&amp;rdquo; step where we let GROMACS iron out all those clashes and bad geometries before we start any real dynamics. We&amp;rsquo;re using the steepest descent algorithm, which just keeps nudging atoms downhill on the energy landscape until the forces are small enough that we&amp;rsquo;re happy. No actual physics happening here — just cleaning up the structure so we have a solid starting point.&lt;/p&gt;
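&lt;p&gt;To make the &amp;ldquo;nudging downhill&amp;rdquo; idea concrete, here is a toy one-dimensional steepest descent on a single harmonic well. It borrows the names &lt;code&gt;emtol&lt;/code&gt;, &lt;code&gt;emstep&lt;/code&gt;, and &lt;code&gt;nsteps&lt;/code&gt; from &lt;code&gt;em.mdp&lt;/code&gt;, but everything else (the energy function, the spring constant, the 1.2 and 0.2 step-scaling factors) is an assumption for illustration and not GROMACS&amp;rsquo;s actual implementation.&lt;/p&gt;

```python
def steepest_descent_1d(x, k=1.0e5, emtol=1000.0, emstep=0.01, nsteps=50000):
    """Minimise V(x) = 0.5*k*x**2; return (final x, steps used)."""
    def energy(pos):
        return 0.5 * k * pos * pos

    def force(pos):
        return -k * pos

    h = emstep                          # current maximum displacement (nm)
    for step in range(nsteps):
        f = force(x)
        if emtol > abs(f):              # converged: force is below emtol
            return x, step
        trial = x + h * f / abs(f)      # move a distance h along the force
        if energy(x) > energy(trial):
            x, h = trial, 1.2 * h       # accept, try a bigger step next time
        else:
            h = 0.2 * h                 # reject, shrink the step
    return x, nsteps


x_final, steps = steepest_descent_1d(1.0)   # start 1 nm from the minimum
print(f"stopped after {steps} steps, remaining force {abs(1.0e5 * x_final):.0f} kJ/mol/nm")
```

&lt;p&gt;The loop stops as soon as the force drops below &lt;code&gt;emtol&lt;/code&gt;, not at the exact minimum, which is why a minimized structure is a starting point and not a final answer.&lt;/p&gt;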




&lt;h3 id=&#34;nvt-equilibration&#34;&gt;NVT Equilibration
  &lt;a href=&#34;#nvt-equilibration&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Make sure to create &lt;code&gt;nvt.mdp&lt;/code&gt; beforehand with the following parameters.&lt;/p&gt;




&lt;h4 id=&#34;nvtmdp&#34;&gt;nvt.mdp
  &lt;a href=&#34;#nvtmdp&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;; NVT EQUILIBRATION
; Position restraints on protein and ligand
define = -DPOSRES -DPOSRES_LIGAND   ; remember this? ⛓️‍💥

; Run parameters
integrator              = md        ; leap-frog integrator
nsteps                  = 50000     ; 2 fs * 50000 = 100 ps
dt                      = 0.002     ; 2 fs
; Output control
nstxout                 = 500       ; save coordinates every 1.0 ps
nstvout                 = 500       ; save velocities every 1.0 ps
nstenergy               = 500       ; save energies every 1.0 ps
nstlog                  = 500       ; update log file every 1.0 ps
nstxout-compressed      = 500       ; save compressed coordinates every 1.0 ps
compressed-x-grps       = System    ; save the whole system
; Bond parameters
continuation            = no        ; first dynamics run
constraint_algorithm    = lincs     ; holonomic constraints 
constraints             = h-bonds   ; bonds involving H are constrained
lincs_iter              = 1         ; accuracy of LINCS
lincs_order             = 4         ; also related to accuracy
; Nonbonded settings 
cutoff-scheme           = Verlet    ; Buffered neighbor searching
ns_type                 = grid      ; search neighboring grid cells
nstlist                 = 10        ; 20 fs, largely irrelevant with Verlet
rcoulomb                = 1.0       ; short-range electrostatic cutoff (in nm)
rvdw                    = 1.0       ; short-range van der Waals cutoff (in nm)
DispCorr                = EnerPres  ; account for cut-off vdW scheme
; Electrostatics
coulombtype             = PME       ; Particle Mesh Ewald for long-range electrostatics
pme_order               = 4         ; cubic interpolation
fourierspacing          = 0.16      ; grid spacing for FFT
; Temperature coupling
tcoupl                  = V-rescale             ; modified Berendsen thermostat
tc-grps                 = Protein Non-Protein   ; two coupling groups - more accurate
tau_t                   = 0.1     0.1           ; time constant, in ps
ref_t                   = 300     300           ; reference temperature, one for each group, in K
; Pressure coupling
pcoupl                  = no        ; no pressure coupling in NVT
; Periodic boundary conditions
pbc                     = xyz       ; 3-D PBC
; Velocity generation
gen_vel                 = yes       ; assign velocities from Maxwell distribution
gen_temp                = 300       ; temperature for Maxwell distribution
gen_seed                = -1        ; generate a random seed
&lt;/code&gt;&lt;/pre&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx grompp -f nvt.mdp -c em.gro -r em.gro -p topol.top -o nvt.tpr
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx mdrun -deffnm nvt -nb gpu -pme gpu -bonded gpu
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# or with cpu&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx mdrun -deffnm nvt
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Now the real fun begins — sort of. Before we let everything run free, we need to carefully bring the system up to temperature while keeping the protein and ligand held in place with position restraints. This NVT (constant volume and temperature) run heats things up to 300 K over 100 ps using the V-rescale thermostat, while the water and ions get to move around and settle in naturally. Think of it like slowly warming up before a workout.&lt;/p&gt;




&lt;h3 id=&#34;npt-equilibration&#34;&gt;NPT Equilibration
  &lt;a href=&#34;#npt-equilibration&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Make sure to create &lt;code&gt;npt.mdp&lt;/code&gt; beforehand with the following parameters:&lt;/p&gt;




&lt;h4 id=&#34;nptmdp&#34;&gt;npt.mdp
  &lt;a href=&#34;#nptmdp&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;; NPT EQUILIBRATION
; Position restraints on protein and ligand
define = -DPOSRES -DPOSRES_LIGAND  ; remember this? ⛓️‍💥

; Run parameters
integrator              = md        ; leap-frog integrator
nsteps                  = 50000     ; 2 * 50000 = 100 ps
dt                      = 0.002     ; 2 fs
; Output control
nstxout                 = 500       ; save coordinates every 1.0 ps
nstvout                 = 500       ; save velocities every 1.0 ps
nstenergy               = 500       ; save energies every 1.0 ps
nstlog                  = 500       ; update log file every 1.0 ps
nstxout-compressed      = 500       ; save compressed coordinates every 1.0 ps
compressed-x-grps       = System    ; save the whole system
; Bond parameters
continuation            = yes       ; continuing from NVT
constraint_algorithm    = lincs     ; holonomic constraints 
constraints             = h-bonds   ; bonds involving H are constrained
lincs_iter              = 1         ; accuracy of LINCS
lincs_order             = 4         ; also related to accuracy
; Nonbonded settings 
cutoff-scheme           = Verlet    ; Buffered neighbor searching
ns_type                 = grid      ; search neighboring grid cells
nstlist                 = 10        ; 20 fs, largely irrelevant with Verlet
rcoulomb                = 1.0       ; short-range electrostatic cutoff (in nm)
rvdw                    = 1.0       ; short-range van der Waals cutoff (in nm)
DispCorr                = EnerPres  ; account for cut-off vdW scheme
; Electrostatics
coulombtype             = PME       ; Particle Mesh Ewald for long-range electrostatics
pme_order               = 4         ; cubic interpolation
fourierspacing          = 0.16      ; grid spacing for FFT
; Temperature coupling
tcoupl                  = V-rescale             ; modified Berendsen thermostat
tc-grps                 = Protein Non-Protein   ; two coupling groups - more accurate
tau_t                   = 0.1     0.1           ; time constant, in ps
ref_t                   = 300     300           ; reference temperature, one for each group, in K
; Pressure coupling
pcoupl                  = Parrinello-Rahman     ; Pressure coupling on in NPT
pcoupltype              = isotropic             ; uniform scaling of box vectors
tau_p                   = 2.0                   ; time constant, in ps
ref_p                   = 1.0                   ; reference pressure, in bar
compressibility         = 4.5e-5                ; isothermal compressibility of water, bar^-1
refcoord_scaling        = com                   ; scale center of mass of reference coordinates
; Periodic boundary conditions
pbc                     = xyz       ; 3-D PBC
; Velocity generation
gen_vel                 = no        ; velocity generated in NVT
&lt;/code&gt;&lt;/pre&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx grompp -f npt.mdp -c nvt.gro -r nvt.gro -t nvt.cpt -p topol.top -o npt.tpr
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx mdrun -deffnm npt -nb gpu -pme gpu -bonded gpu
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# or with cpu&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx mdrun -deffnm npt
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Temperature is sorted, now let&amp;rsquo;s get the pressure and density right too. In this NPT (constant pressure and temperature) step, we swap in the Parrinello-Rahman barostat to let the box size adjust until the pressure stabilizes at 1.0 bar. The protein and ligand are still restrained here — we&amp;rsquo;re just letting the solvent finish getting comfortable. This step is what makes sure your water density is physically reasonable before you remove the training wheels and start production. Another 100 ps and you&amp;rsquo;re good to go.&lt;/p&gt;
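&lt;p&gt;One way to confirm that the box has settled is to pull the density out of &lt;code&gt;npt.edr&lt;/code&gt; and summarize it. The following is a minimal sketch, not part of the original workflow: it assumes &lt;code&gt;gmx energy&lt;/code&gt; has already written &lt;code&gt;density.xvg&lt;/code&gt; (command shown in the comment), and &lt;code&gt;xvg_mean_sd&lt;/code&gt; is a made-up helper name.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;# Assumed to have been run first (requires GROMACS); writes time vs. density:
#   echo "Density" | gmx energy -f npt.edr -o density.xvg

# Hypothetical helper: mean and standard deviation of column 2 of an .xvg file.
xvg_mean_sd() {
  awk '
    /^[#@]/ { next }                          # skip xvg comment and plot-directive lines
    NF >= 2 { n++; s += $2; ss += $2 * $2 }   # running sum and sum of squares
    END {
      if (n == 0) { print "no data"; exit 1 }
      mean = s / n
      var = ss / n - mean * mean              # population variance
      if (var > 0) sd = sqrt(var); else sd = 0   # guard against tiny negative rounding error
      printf "mean %.2f sd %.2f\n", mean, sd
    }
  ' "$1"
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Calling &lt;code&gt;xvg_mean_sd density.xvg&lt;/code&gt; prints the mean and spread of the density in kg/m^3. For a water-dominated box, a mean of roughly 1000 kg/m^3 with a small spread is the kind of result to hope for.&lt;/p&gt;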




&lt;h3 id=&#34;prod&#34;&gt;Run Production
  &lt;a href=&#34;#prod&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Make sure to create &lt;code&gt;md.mdp&lt;/code&gt; beforehand with the following parameters:&lt;/p&gt;




&lt;h4 id=&#34;mdmdp&#34;&gt;md.mdp
  &lt;a href=&#34;#mdmdp&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;; PRODUCTION MD
; Run parameters
integrator              = md        ; leap-frog integrator
nsteps                  = 5000000   ; 2 * 5000000 = 10 ns (10,000 ps)
dt                      = 0.002     ; 2 fs
; Output control
nstxout                 = 0         ; suppress bulky .trr file by specifying 0
nstvout                 = 0         ; suppress bulky .trr file by specifying 0
nstenergy               = 5000      ; save energies every 10 ps
nstlog                  = 5000      ; update log file every 10 ps
nstxout-compressed      = 5000      ; save compressed coordinates every 10 ps
compressed-x-grps       = System    ; save the whole system
compressed-x-precision  = 1000      ; precision with which to write to the compressed trajectory file
; energygrps            = Protein UNL ; turn this back on after gpu mdrun  
; Bond parameters
continuation            = yes       ; continuing from NPT
constraint_algorithm    = lincs     ; holonomic constraints 
constraints             = h-bonds   ; bonds involving H are constrained
lincs_iter              = 1         ; accuracy of LINCS
lincs_order             = 4         ; also related to accuracy
; Nonbonded settings 
cutoff-scheme           = Verlet    ; Buffered neighbor searching
ns_type                 = grid      ; search neighboring grid cells
nstlist                 = 10        ; 20 fs, largely irrelevant with Verlet
rcoulomb                = 1.0       ; short-range electrostatic cutoff (in nm)
rvdw                    = 1.0       ; short-range van der Waals cutoff (in nm)
DispCorr                = EnerPres  ; account for cut-off vdW scheme
; Electrostatics
coulombtype             = PME       ; Particle Mesh Ewald for long-range electrostatics
pme_order               = 4         ; cubic interpolation
fourierspacing          = 0.16      ; grid spacing for FFT
; Temperature coupling
tcoupl                  = V-rescale             ; modified Berendsen thermostat
tc-grps                 = Protein Non-Protein   ; two coupling groups - more accurate
tau_t                   = 0.1     0.1           ; time constant, in ps
ref_t                   = 300     300           ; reference temperature, one for each group, in K
; Pressure coupling
pcoupl                  = Parrinello-Rahman     ; Pressure coupling on in NPT
pcoupltype              = isotropic             ; uniform scaling of box vectors
tau_p                   = 2.0                   ; time constant, in ps
ref_p                   = 1.0                   ; reference pressure, in bar
compressibility         = 4.5e-5                ; isothermal compressibility of water, bar^-1
; Periodic boundary conditions
pbc                     = xyz       ; 3-D PBC
; Velocity generation
gen_vel                 = no        ; continuing from NPT
&lt;/code&gt;&lt;/pre&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -o md.tpr
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx mdrun -deffnm md -nb gpu -pme gpu
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# or cpu&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx mdrun -deffnm md
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# if ctrl+c and want to resume/continue&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx mdrun -v -deffnm md -nb gpu -pme gpu -cpi md.cpt -append
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This is what we&amp;rsquo;ve been working toward. Restraints are off, the system is equilibrated, and we&amp;rsquo;re letting the simulation run freely for 10 nanoseconds. Every 10 ps the coordinates get saved to a compressed .xtc trajectory file for analysis later. Sit back, let it cook, and get ready for the analysis phase.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: The default above is 10 ns. We extended the run to 50 ns to further assess the RMSD issue we saw initially (not documented here); see the note below on how to do so. Also! These runs are quite storage intensive! Make sure your hard drive has plenty of free space!&lt;/p&gt;
&lt;/blockquote&gt;
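&lt;p&gt;To double-check the run length and how many frames to expect, the arithmetic from &lt;code&gt;md.mdp&lt;/code&gt; can be reproduced in the shell. This is a minimal sketch using the values shown above (&lt;code&gt;nsteps = 5000000&lt;/code&gt;, &lt;code&gt;dt = 0.002&lt;/code&gt;, &lt;code&gt;nstxout-compressed = 5000&lt;/code&gt;). The function name &lt;code&gt;md_summary&lt;/code&gt; is made up for illustration, and the frame count assumes the starting frame at t = 0 is written as well.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;# Hypothetical helper: simulated time and frame count from md.mdp values.
# Usage: md_summary NSTEPS DT_PS NSTXOUT_COMPRESSED
md_summary() {
  awk -v nsteps="$1" -v dt="$2" -v nstx="$3" 'BEGIN {
    total_ps = nsteps * dt              # total simulated time in ps
    frame_ps = nstx * dt                # time between saved frames in ps
    frames   = int(nsteps / nstx) + 1   # +1 for the starting frame at t = 0
    printf "%.0f ps (%.0f ns), one frame every %.0f ps, %d frames\n", total_ps, total_ps / 1000, frame_ps, frames
  }'
}

md_summary 5000000 0.002 5000
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With the values above this reports 10000 ps (10 ns), one frame every 10 ps, and 1001 frames. Passing &lt;code&gt;25000000&lt;/code&gt; steps instead gives the numbers for a 50 ns run.&lt;/p&gt;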




&lt;h2 id=&#34;assess&#34;&gt;Assessment
  &lt;a href=&#34;#assess&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Run this first to re-center the trajectory:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Step 1: Remove jumps across periodic boundaries first&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;0&amp;#34;&lt;/span&gt; | gmx trjconv -s md.tpr -f md.xtc -o md_nojump.xtc -pbc nojump
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Step 2: Center on protein and apply mol PBC&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;1 0&amp;#34;&lt;/span&gt; | gmx trjconv -s md.tpr -f md_nojump.xtc -o md_noPBC.xtc -pbc mol -center
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3 id=&#34;rmsd&#34;&gt;RMSD — Root Mean Square Deviation
  &lt;a href=&#34;#rmsd&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;What it measures: How much a structure deviates from a reference (usually the starting, post-equilibration frame).&lt;/p&gt;
&lt;p&gt;A stable complex should reach a plateau. Continuous drift means the system hasn&amp;rsquo;t equilibrated or the ligand is dissociating.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Protein backbone RMSD&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;4 4&amp;#39;&lt;/span&gt; | gmx rms -s md.tpr -f md_noPBC.xtc -o rmsd_protein.xvg -tu ns 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Ligand RMSD (after fitting to protein backbone — critical!)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;4 13&amp;#39;&lt;/span&gt; | gmx rms -s md.tpr -f md_noPBC.xtc -o rmsd_ligand.xvg -tu ns 
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We typically want to assess both the protein RMSD and the ligand RMSD (measured after fitting to the protein backbone). We essentially want to see a stable protein whose RMSD plateaus; a commonly quoted rule of thumb is &amp;lt;2-3 Å (0.2-0.3 nm) 🤷‍♂️ The same goes for the ligand RMSD.&lt;/p&gt;
&lt;p&gt;If the ligand RMSD keeps growing as time passes, that points to instability or dissociation. Let&amp;rsquo;s take a look at our result!&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidyverse)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(xvm)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;prot &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_xvg&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rmsd_protein.xvg&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ligand &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_xvg&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rmsd_ligand.xvg&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_xvg&lt;/span&gt;(prot)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim/index_files/figure-html/unnamed-chunk-19-1.png&#34; width=&#34;672&#34; /&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_xvg&lt;/span&gt;(ligand)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim/index_files/figure-html/unnamed-chunk-19-2.png&#34; width=&#34;672&#34; /&gt;
&lt;/details&gt;
&lt;p&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim/index_files/figure-html/unnamed-chunk-20-1.png&#34; width=&#34;672&#34; /&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim/index_files/figure-html/unnamed-chunk-20-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Backbone RMSD rises from 0 to ~0.5 nm within the first ~5 ns and plateaus around 0.5–0.6 nm for the remainder of the 50 ns simulation, indicating the protein equilibrates early and maintains a stable conformation throughout — consistent across all simulation lengths tested. The ligand (UNL) RMSD tells a different story: it fluctuates around 0.2–0.4 nm for the first ~20 ns, then drifts progressively upward to ~0.5–0.7 nm after ~25 ns with no plateau, suggesting the ligand undergoes a conformational transition away from its original docked pose mid-simulation. However, this should be interpreted alongside the minimum distance, COM distance, H-bond, and interaction energy data.&lt;/p&gt;
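&lt;p&gt;The plateau can also be checked numerically rather than by eye. Below is a minimal sketch with &lt;code&gt;awk&lt;/code&gt;, assuming the two-column &lt;code&gt;.xvg&lt;/code&gt; layout written by &lt;code&gt;gmx rms&lt;/code&gt; (time, then RMSD in nm) with &lt;code&gt;#&lt;/code&gt; and &lt;code&gt;@&lt;/code&gt; header lines. &lt;code&gt;rmsd_tail_mean&lt;/code&gt; is a hypothetical helper name, and averaging over the second half of the trajectory is an arbitrary choice for illustration.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;# Hypothetical helper: mean RMSD over the second half of an .xvg file.
# Prints the mean in nm and in Angstrom (1 nm = 10 Angstrom).
rmsd_tail_mean() {
  awk '
    /^[#@]/ { next }                 # skip xvg comment and plot-directive lines
    NF >= 2 { n++; val[n] = $2 }     # column 2 holds the RMSD in nm
    END {
      if (n == 0) { print "no data"; exit 1 }
      start = int(n / 2) + 1         # first row of the second half
      for (i = n; i >= start; i--) { sum += val[i]; count++ }
      printf "%.3f nm = %.2f A\n", sum / count, 10 * sum / count
    }
  ' "$1"
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Running &lt;code&gt;rmsd_tail_mean rmsd_protein.xvg&lt;/code&gt; and &lt;code&gt;rmsd_tail_mean rmsd_ligand.xvg&lt;/code&gt; gives one number per curve that can be compared against the rule-of-thumb threshold.&lt;/p&gt;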
&lt;blockquote&gt;
&lt;p&gt;Note: I had to use &lt;code&gt;echo &amp;#34;1 0&amp;#34; | gmx trjconv -s md.tpr -f md.xtc -o md_noPBC.xtc -pbc mol -center&lt;/code&gt; to re-center because the ligand RMSD had some outliers with values of 12-15. These disappeared after re-centering.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: I also had to extend the run from 10 ns to 50 ns to check whether the protein RMSD plateaued, as it kept rising. To extend, use this (&lt;code&gt;-extend&lt;/code&gt; takes picoseconds, so 40000 adds 40 ns):&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx convert-tpr -s md.tpr -extend &lt;span style=&#34;color:#099&#34;&gt;40000&lt;/span&gt; -o md_extended.tpr
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx mdrun -deffnm md_extended &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -s md_extended.tpr &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -cpi md.cpt &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -nb gpu -pme gpu -bonded gpu -update gpu &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -ntmpi &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; -ntomp &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&lt;/span&gt;  -v
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3 id=&#34;rmsf&#34;&gt;RMSF — Root Mean Square Fluctuation
  &lt;a href=&#34;#rmsf&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;What it measures: Per-residue (or per-atom) flexibility over time.&lt;/p&gt;
&lt;p&gt;Identifies which regions of the protein and which atoms of the ligand are mobile. High flexibility at the binding site residues suggests the ligand isn&amp;rsquo;t holding them in place as expected.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Per-residue protein flexibility&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; | gmx rmsf -s md.tpr -f md_noPBC.xtc -o rmsf_protein.xvg -res 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Per-atom ligand flexibility&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;13&lt;/span&gt; | gmx rmsf -s md.tpr -f md_noPBC.xtc -o rmsf_ligand.xvg 
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Binding site residues with low RMSF (&amp;lt; 1–1.5 Å, i.e. 0.1–0.15 nm) are well restrained by the ligand. If the RMSF of active site residues is high while your ligand RMSD is also high, the ligand is not stabilizing the pocket. What does ours say?&lt;/p&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim/index_files/figure-html/unnamed-chunk-23-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;A few minor peaks around residues 200, 300, and 580–600 suggest localized loop flexibility mid-structure but nothing dramatic. Overall the RMSF profile is consistent with a stable, well-folded protein throughout the 50 ns simulation, with flexibility confined primarily to the N-terminal region as expected for a large multimeric protein like PBP2x.&lt;/p&gt;
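&lt;p&gt;To list the flexible residues instead of reading them off the plot, the &lt;code&gt;.xvg&lt;/code&gt; file can be filtered directly. This is a minimal sketch assuming the two-column output of &lt;code&gt;gmx rmsf -res&lt;/code&gt; (residue number, then RMSF in nm). &lt;code&gt;flexible_residues&lt;/code&gt; is a hypothetical helper name, and the 0.15 nm cutoff is simply the upper end of the rule of thumb quoted above.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;# Hypothetical helper: list residues whose RMSF exceeds a cutoff (in nm).
# Usage: flexible_residues rmsf_protein.xvg 0.15
flexible_residues() {
  awk -v cutoff="$2" '
    /^[#@]/ { next }                                                  # skip xvg header lines
    NF >= 2 { if ($2 + 0 > cutoff + 0) printf "%s %.3f\n", $1, $2 }   # residue number, RMSF in nm
  ' "$1"
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Each output line is a residue number followed by its RMSF, which makes it easy to check whether any of the flagged residues sit in the binding pocket.&lt;/p&gt;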
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;rmsf_ligand &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_xvg&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rmsf_ligand.xvg&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_xvg&lt;/span&gt;(rmsf_ligand)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim/index_files/figure-html/unnamed-chunk-24-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Ligand RMSF analysis shows per-atom fluctuations ranging from ~0.02–0.27 nm across all 41 atoms (system atoms 10041–10082), with a notable peak around atom 10068 corresponding to H7 on the aromatic ring region (benzene ring), suggesting this part of the ligand is the most dynamically flexible. Overall fluctuation levels are moderate and consistent with the dynamic pose sampling behavior observed throughout the simulation.&lt;/p&gt;
&lt;p&gt;Now let&amp;rsquo;s visualize with &lt;code&gt;ChimeraX&lt;/code&gt; and use a color palette based on the temperature factor (B-factor) written by &lt;code&gt;gmx rmsf -s md.tpr -f md_noPBC.xtc -o rmsf.xvg -oq rmsf_bfactor.pdb -res&lt;/code&gt;. Blue means minimal wiggliness, white is in the middle, and red is lots of wiggliness.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;rmsf_bfactor.png&#34; alt=&#34;image&#34; width=&#34;100%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;And we can see (boxed in green) that our ligand rests quite comfortably in the pocket previously occupied by cefepime, without a whole lot of wiggliness, indicating little conformational change around the binding site.&lt;/p&gt;




&lt;h3 id=&#34;gyration&#34;&gt;Radius of Gyration (Rg)
  &lt;a href=&#34;#gyration&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;What it measures: Compactness of the protein.&lt;/p&gt;
&lt;p&gt;If the protein unfolds or the binding pocket opens/closes dramatically due to ligand binding/unbinding, Rg will change. It&amp;rsquo;s a global indicator of structural integrity.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;1&amp;#39;&lt;/span&gt; | gmx gyrate -s md.tpr -f md_noPBC.xtc -o gyrate.xvg 
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim/index_files/figure-html/unnamed-chunk-26-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;According to Claude, the radius of gyration plot for the 50 ns simulation shows total Rg remaining stable at ~2.9 nm throughout, confirming the protein maintains its overall fold with no unfolding or collapse. RgX stays relatively flat around ~2.75 nm. However RgY and RgZ show dramatic and unusual fluctuations between ~20,000–45,000 ps — RgY drops sharply from ~2.75 nm down to ~2.3 nm then recovers, while RgZ spikes from ~1.8 nm up to ~2.45 nm then returns to baseline. These large axial swings occurring simultaneously in opposite directions are concerning and could indicate significant domain reorganization, partial chain separation, or a PBC artifact requiring further investigation. Will visualize in ChimeraX to determine the cause.&lt;/p&gt;
&lt;p&gt;In ChimeraX, load &lt;code&gt;fullsystem.gro&lt;/code&gt;, then load &lt;code&gt;md_noPBC.xtc&lt;/code&gt; and enter the commands below. Then move to around 23000-30000 and see if it&amp;rsquo;s just a rotation.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;hide solvent
hide :NA
hide :CL
cartoon
color bychain
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;img src=&#34;pbp2x_pcn_gyration.gif&#34; alt=&#34;&#34;&gt;
Alright, it&amp;rsquo;s just a rotation! The RgY and RgZ fluctuations between ~20–45 ns are attributable to protein rotation/tumbling in the simulation box rather than genuine conformational change, as confirmed by visual inspection of the trajectory in ChimeraX. Total Rg remains flat throughout, and the axial variations reflect reorientation rather than structural instability. Overall the protein fold is well-maintained across the full 50 ns simulation.&lt;/p&gt;




&lt;h3 id=&#34;interaction&#34;&gt;Ligand-Protein Interaction Energy
  &lt;a href=&#34;#interaction&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;What it measures: Short-range Lennard-Jones (van der Waals) and Coulomb (electrostatic) energies between ligand and protein.&lt;/p&gt;
&lt;p&gt;This is the most direct energetic readout of binding, although it is not a binding free energy. If the ligand is truly bound, the interaction energy should be negative (favorable) and stable. If it dissociates, the energy approaches 0.&lt;/p&gt;
&lt;p&gt;Add this to &lt;code&gt;md.mdp&lt;/code&gt;&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;energygrps = Protein UNL ; UNL is the ligand here, use whichever name your ligand is listed under
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;then rerun:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx grompp -f md.mdp -c md.gro -p topol.top -n index.ndx -o rerun.tpr
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx mdrun -s rerun.tpr -rerun md_noPBC.xtc -e rerun.edr -ntmpi &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; -ntomp &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx energy -f rerun.edr -o interaction_energy.xvg 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# then enter these (the suffix matches the ligand group name used in energygrps)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Coul-SR:Protein-UNL
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;LJ-SR:Protein-UNL
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;interaction &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_xvg&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;interaction_energy.xvg&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plot_xvg&lt;/span&gt;(interaction)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim/index_files/figure-html/unnamed-chunk-29-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Interaction energy plots show LJ/vdW interaction is more stable and consistently negative (~-80 to -130 kJ/mol) throughout the simulation, while Coulomb interaction is highly variable (0 to -80 kJ/mol) with frequent excursions toward zero, indicating the ligand maintains physical contact with the protein but repeatedly loses and regains specific electrostatic interactions. This pattern is consistent with dynamic surface sampling rather than stable binding mode occupancy.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: To assess interaction energy, we have to rerun mdrun, which only uses CPU cores, no GPU. It won&amp;rsquo;t take too long even for 50 ns. I would do this after hbond. I tried doing this together with the first production run, but it wouldn&amp;rsquo;t let me.&lt;/p&gt;
&lt;/blockquote&gt;
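&lt;p&gt;To put numbers on the plot, here is a minimal sketch that averages each data column of &lt;code&gt;interaction_energy.xvg&lt;/code&gt;. The function name &lt;code&gt;xvg_column_means&lt;/code&gt; is my own, and it assumes the usual xvg layout: header lines starting with # or @, then time in column 1 and one energy term per following column, in whatever order gmx energy wrote them.&lt;/p&gt;

```shell
# Sketch (hypothetical helper): mean of every data column in a GROMACS .xvg file.
# Assumes header lines start with '#' or '@' and column 1 is time.
xvg_column_means() {
  awk '
    /^[#@]/ || NF == 0 { next }                  # skip header and blank lines
    {
      for (i = 2; NF >= i; i++) sum[i] += $i     # accumulate each energy column
      if (NF > ncol) ncol = NF
      n++
    }
    END {
      if (n == 0) exit 1                         # no data rows found
      for (i = 2; ncol >= i; i++)
        printf "column %d mean: %.2f\n", i - 1, sum[i] / n
    }
  ' "$1"
}
```

&lt;p&gt;Calling &lt;code&gt;xvg_column_means interaction_energy.xvg&lt;/code&gt; prints one mean per energy term in kJ/mol. Check the legend lines in the file to see which column is Coulomb and which is LJ.&lt;/p&gt;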




&lt;h3 id=&#34;hbond&#34;&gt;Hydrogen Bond Analysis
  &lt;a href=&#34;#hbond&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;What it measures: Number and occupancy of H-bonds between ligand and protein over the trajectory.&lt;/p&gt;
&lt;p&gt;Critical for beta-lactam/PBP interactions — the acylation mechanism involves specific H-bonds. Persistent H-bonds = meaningful interaction.&lt;/p&gt;
&lt;p&gt;I was told that using mol2 directly carries more info than pdb during acpype conversion, so maybe try that out as well.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;1 13&amp;#39;&lt;/span&gt; | gmx hbond -s md.tpr -f md_noPBC.xtc -num hbond_ligand_protein.xvg -tu ns 
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim/index_files/figure-html/unnamed-chunk-31-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Ligand-protein H-bond analysis reveals 1–3 persistent H-bonds throughout the simulation with only rare drops to zero, indicating the ligand maintains contact with the protein but does not settle into a single stable binding pose — consistent with the progressive RMSD drift. Rather than full dissociation, the ligand appears to be dynamically sampling multiple interaction geometries. This suggests weak or non-specific binding at this site, and a more favorable docking pose or alternative binding site may need to be explored.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: Try simulating a ligand-protein complex where you know it won&amp;rsquo;t bind, and you will see the hydrogen bond count stay at 0 all the way. Quite often, when I mistakenly docked at the wrong coordinates, the first few minutes of production showed a very low hydrogen bond average. In those situations, I usually stop instead of continuing. What do you guys do?&lt;/p&gt;
&lt;/blockquote&gt;
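&lt;p&gt;A rough way to quantify &amp;ldquo;persistent&amp;rdquo; is the share of frames with at least one H-bond. The sketch below is an assumption-laden helper, not part of GROMACS: &lt;code&gt;hbond_summary&lt;/code&gt; is a name I made up, and it assumes column 2 of &lt;code&gt;hbond_ligand_protein.xvg&lt;/code&gt; holds the H-bond count per frame.&lt;/p&gt;

```shell
# Sketch (hypothetical helper): summarise the per-frame H-bond counts in an .xvg file.
# Assumes column 1 is time and column 2 is the number of H-bonds in that frame.
hbond_summary() {
  awk '
    /^[#@]/ || NF == 0 { next }      # skip header and blank lines
    {
      n++
      total += $2
      if ($2 > 0) bound++            # frame has at least one H-bond
    }
    END {
      if (n == 0) exit 1             # no data rows found
      printf "frames=%d mean_hbonds=%.2f frames_with_hbond=%.1f%%\n", n, total / n, 100 * bound / n
    }
  ' "$1"
}
```

&lt;p&gt;Run it as &lt;code&gt;hbond_summary hbond_ligand_protein.xvg&lt;/code&gt;. A percentage near zero early in production would be the &amp;ldquo;stop and re-dock&amp;rdquo; signal described in the note above.&lt;/p&gt;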




&lt;h3 id=&#34;distance&#34;&gt;Binding Pocket Distance Analysis
  &lt;a href=&#34;#distance&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;What it measures: Distance between the ligand center-of-mass (or key atoms) and key binding site residues.&lt;/p&gt;
&lt;p&gt;RMSD can be misleading if the ligand rotates in place. Tracking distances to known catalytic residues (e.g., active site Ser, Lys, Thr in PBPs) gives mechanistically interpretable data.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;echo&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;1 13&amp;#39;&lt;/span&gt; | gmx mindist -f md_noPBC.xtc -s md.tpr -n index.ndx -od mindist_prot_lig.xvg -on numcont_prot_lig.xvg -d 0.4
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx distance -f md_noPBC.xtc -s md.tpr -n index.ndx -select &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;com of group &amp;#34;Protein&amp;#34; plus com of group &amp;#34;UNL&amp;#34;&amp;#39;&lt;/span&gt; -oall distance_prot_lig.xvg
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mindist &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_lines&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;mindist_prot_lig.xvg&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mindist_list &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mindist[&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(mindist, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;^@|^#&amp;#34;&lt;/span&gt;)] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_split&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mindist_val &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mindist_list))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mindist_list)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  time[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mindist_list[[i]][1]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mindist_val[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mindist_list[[i]][3]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mindist_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; time, mindist_val &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; mindist_val) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;(time),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         mindist_val &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;(mindist_val))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot_mindist &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mindist_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggplot&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;time,y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;mindist_val)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_point&lt;/span&gt;(color &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;red&amp;#34;&lt;/span&gt;, alpha&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0.7&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme_bw&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;dist &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_lines&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;distance_prot_lig.xvg&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;dist_list &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; dist[&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(dist, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;^@|^#&amp;#34;&lt;/span&gt;)] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_trim&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_split&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;    &amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; dist_val &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(dist_list))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(dist_list)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  time[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; dist_list[[i]][1] 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  dist_val[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; dist_list[[i]][2]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;dist_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; time, dist_val &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; dist_val) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;(time),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         dist_val &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;(dist_val))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot_dist &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; dist_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggplot&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;time,y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;dist_val)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_point&lt;/span&gt;(color &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;red&amp;#34;&lt;/span&gt;, alpha&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0.7&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme_bw&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim/index_files/figure-html/unnamed-chunk-34-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Minimum distance analysis between the ligand and protein confirms the ligand remains in close proximity (~0.15–0.25 nm) throughout the entire 50 ns simulation with no upward drift, ruling out full dissociation. Together with the persistent 1–3 protein-ligand H-bonds, this suggests the ligand maintains continuous contact with the protein surface but explores multiple binding geometries rather than converging on a single stable pose — consistent with weak or dynamic binding at this site.&lt;/p&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/mdsim/index_files/figure-html/unnamed-chunk-35-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Center of mass distance between ligand and protein remains stable at ~2.75–2.95 nm throughout the 50 ns simulation with no directional drift, confirming the ligand does not dissociate from the protein. Combined with the minimum distance (~0.15–0.25 nm) and persistent 1–3 H-bonds, the collective picture is one of a ligand that remains associated with the protein but dynamically samples multiple surface poses rather than adopting a single locked binding mode — suggesting the binding interaction is real but relatively weak or non-specific at this site.&lt;/p&gt;
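&lt;p&gt;&amp;ldquo;No directional drift&amp;rdquo; can also be checked numerically by comparing the mean distance in the first half of the run with the second half. This is only a sketch: &lt;code&gt;xvg_drift&lt;/code&gt; is a hypothetical name, and it assumes the distance sits in column 2 of the xvg file (time in column 1).&lt;/p&gt;

```shell
# Sketch (hypothetical helper): compare first-half and second-half means of column 2.
# A drift close to 0 suggests the distance is not trending up or down.
xvg_drift() {
  awk '
    /^[#@]/ || NF == 0 { next }      # skip header and blank lines
    { v[++n] = $2 }                  # store the distance for each frame
    END {
      if (2 > n) exit 1              # need at least two frames
      half = int(n / 2)
      for (i = 1; half >= i; i++) first += v[i]
      for (i = half + 1; n >= i; i++) second += v[i]
      a = first / half
      b = second / (n - half)
      printf "first_half=%.3f second_half=%.3f drift=%.3f\n", a, b, b - a
    }
  ' "$1"
}
```

&lt;p&gt;For example, &lt;code&gt;xvg_drift distance_prot_lig.xvg&lt;/code&gt; reports both half-means in nm. A clearly positive drift would hint that the ligand is moving away from the protein.&lt;/p&gt;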




&lt;h3 id=&#34;visual&#34;&gt;Visual Inspection
  &lt;a href=&#34;#visual&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;&lt;img src=&#34;pbp2x_pcn_viz.gif&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;🙌 This trajectory is from the last few steps of the 50 ns. Looks like it&amp;rsquo;s still at the same place! Fantastic!&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s extract the frames at 0 ps and 50,000 ps (the start and end of the 50 ns run) and see where our ligand is.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Extract full system at time 0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx trjconv -s md.tpr -f md_noPBC.xtc -o frame0.pdb -dump &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Select System&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Extract full system at time 50000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gmx trjconv -s md.tpr -f md_noPBC.xtc -o frame50000.pdb -dump &lt;span style=&#34;color:#099&#34;&gt;50000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Select System&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In ChimeraX, open both, then:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;hide solvent
hide protein
hide :NA
hide :CL
cartoon
matchmaker #2 to #1 pairing bb
color #1:UNL red
color #2:UNL green
transparency 80 target c
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The matchmaker aligns both frames.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;pbp2x_pcn_fram0_fram5000_viz.gif&#34; alt=&#34;&#34;&gt;
Great! Looks like the coordinates are quite similar, especially the beta-lactam ring area, which is the most important part for the interaction. The tail area seems to have more wiggle room, which is consistent with our RMSF analysis.&lt;/p&gt;
&lt;p&gt;Wow, I learnt so much, and there is so much more to learn! The physics is really fascinating! I don&amp;rsquo;t understand all of it, but at least this is a step towards knowing what I don&amp;rsquo;t know, which is a lot! We went through plenty of trial and error, and thanks to Claude, I was able to get unstuck! Still, it took quite a long time, from experimentation, to reproducing with different pdbs, to the documentation! My heart goes out to those who do this for a living&amp;hellip; it&amp;rsquo;s not easy! ❤️ Thanks to all who contribute to the scientific community! All the software I used is freely available, and you can reproduce this at home as well! Please be sure to cite them!&lt;/p&gt;




&lt;h2 id=&#34;opportunities&#34;&gt;Opportunities for Improvement
  &lt;a href=&#34;#opportunities&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Try out protein structures from 
&lt;a href=&#34;https://alphafold.com/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;AlphaFold Protein Structure Database&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Try out ColabFold&lt;/li&gt;
&lt;li&gt;Incorporate 
&lt;a href=&#34;https://github.com/gcorso/DiffDock&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;diffdock&lt;/a&gt; in discovering binding poses, compare it with 
&lt;a href=&#34;https://github.com/Discngine/fpocket&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;fpocket&lt;/a&gt;, and perhaps also 
&lt;a href=&#34;https://autodock-vina.readthedocs.io/en/latest/index.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;vina&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;I was told that using mol2 directly has more info than pdb during acpype&lt;/li&gt;
&lt;li&gt;we did not assess covalent docking&lt;/li&gt;
&lt;li&gt;I need to turn all of the above into a pipeline where there is less copy and paste and typing! So I can run more!&lt;/li&gt;
&lt;li&gt;I am really curious about how certain ESBL beta-lactamases have affinity for beta-lactamase inhibitors. I want to see if GROMACS can simulate this!&lt;/li&gt;
&lt;li&gt;I should really be using index in gromacs&lt;/li&gt;
&lt;li&gt;learn about MM-GBSA / MM-PBSA for post-processing&lt;/li&gt;
&lt;li&gt;learn about protonation (physiological pH), probably using obabel&lt;/li&gt;
&lt;/ul&gt;
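&lt;p&gt;As a first step towards that pipeline, here is a minimal sketch that wraps two of the analysis commands used above in functions. The names &lt;code&gt;run_gmx&lt;/code&gt; and &lt;code&gt;analyse_ligand&lt;/code&gt; and the &lt;code&gt;DRY_RUN&lt;/code&gt; switch are my own inventions, and the gmx flags are copied from the commands earlier in this post. With &lt;code&gt;DRY_RUN=1&lt;/code&gt; it only prints what it would run.&lt;/p&gt;

```shell
# Sketch (hypothetical helpers): wrap repeated gmx analysis calls in functions.
# Usage: run_gmx "GROUPS" SUBCOMMAND [ARGS...]
run_gmx() {
  selection="$1"      # group numbers to feed to the interactive prompt
  shift
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it
    echo "[dry-run] echo '$selection' | gmx $*"
  else
    echo "$selection" | gmx "$@"
  fi
}

# Usage: analyse_ligand TPR XTC NDX "GROUPS"
analyse_ligand() {
  tpr="$1"; xtc="$2"; ndx="$3"; groups="$4"
  # H-bond count between the two selected groups
  run_gmx "$groups" hbond -s "$tpr" -f "$xtc" -num hbond_ligand_protein.xvg -tu ns
  # Minimum distance and number of contacts within 0.4 nm
  run_gmx "$groups" mindist -f "$xtc" -s "$tpr" -n "$ndx" -od mindist_prot_lig.xvg -on numcont_prot_lig.xvg -d 0.4
}
```

&lt;p&gt;For example, &lt;code&gt;DRY_RUN=1 analyse_ligand md.tpr md_noPBC.xtc index.ndx &amp;quot;1 13&amp;quot;&lt;/code&gt; lists the two commands without touching GROMACS. Drop &lt;code&gt;DRY_RUN&lt;/code&gt; to run them for real, and change the group numbers to match your own index.&lt;/p&gt;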




&lt;h2 id=&#34;lessons&#34;&gt;Lessons Learnt
  &lt;a href=&#34;#lessons&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;learnt that you can use &lt;code&gt;echo 15 | gmx something&lt;/code&gt; to feed &lt;code&gt;15&lt;/code&gt; to the command&amp;rsquo;s interactive prompt, or use a heredoc: &lt;code&gt;gmx something &amp;lt;&amp;lt; EOF&lt;/code&gt;, then &lt;code&gt;15&lt;/code&gt; and &lt;code&gt;EOF&lt;/code&gt; on their own lines&lt;/li&gt;
&lt;li&gt;learnt some basic gromacs workflow, interpretation, assessment&lt;/li&gt;
&lt;li&gt;learnt autodock vina and diffdock&lt;/li&gt;
&lt;li&gt;learnt to visualize with ChimeraX&lt;/li&gt;
&lt;li&gt;for GTX 1080, average around 40-50ns/day&lt;/li&gt;
&lt;/ul&gt;
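&lt;p&gt;The piping trick from the first bullet can be tried without GROMACS. In this sketch, &lt;code&gt;fake_gmx_prompt&lt;/code&gt; is a hypothetical stand-in for a gmx tool that asks for two groups. It just reads whatever arrives on standard input, which is all the pipe is doing.&lt;/p&gt;

```shell
# Hypothetical stand-in for an interactive gmx tool that asks for two groups.
fake_gmx_prompt() {
  # Word-split everything on standard input into positional parameters
  set -- $(cat)
  echo "group A=$1 group B=$2"
}

# Both selections on one line, as in the echo examples in this post
echo '1 13' | fake_gmx_prompt

# One selection per line, which printf makes easy to spell out
printf '1\n13\n' | fake_gmx_prompt
```

&lt;p&gt;Both calls hand the same two answers to the stand-in, so which form to use is mostly a matter of readability.&lt;/p&gt;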
&lt;p&gt;If you like this article:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;please feel free to send me a 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;comment or visit my other blogs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;please feel free to follow me on 
&lt;a href=&#34;https://bsky.app/profile/kenkoonwong.bsky.social&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;BlueSky&lt;/a&gt;, 
&lt;a href=&#34;https://twitter.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;twitter&lt;/a&gt;, 
&lt;a href=&#34;https://github.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub&lt;/a&gt; or 
&lt;a href=&#34;https://med-mastodon.com/@kenkoonwong&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Mastodon&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;if you would like to collaborate, please feel free to 
&lt;a href=&#34;https://www.kenkoonwong.com/contact/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;contact me&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>Setting Up A Cluster of Tiny PCs For Parallel Computing - A Note To Myself</title>
      <link>https://www.kenkoonwong.com/blog/parallel-computing/</link>
      <pubDate>Fri, 16 Jan 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.kenkoonwong.com/blog/parallel-computing/</guid>
      <description>&lt;script src=&#34;https://www.kenkoonwong.com/blog/parallel-computing/index_files/kePrint/kePrint.js&#34;&gt;&lt;/script&gt;
&lt;link href=&#34;https://www.kenkoonwong.com/blog/parallel-computing/index_files/lightable/lightable.css&#34; rel=&#34;stylesheet&#34; /&gt;
&lt;blockquote&gt;
&lt;p&gt;Enjoyed learning the process of setting up a cluster of tiny PCs for parallel computing. A note to myself on installing Ubuntu, passwordless SSH, automating package installation across nodes, distributing R simulations, and comparing CV5 vs CV10 performance. Fun project!&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2 id=&#34;motivations&#34;&gt;Motivations
  &lt;a href=&#34;#motivations&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Part of what I want to learn this year is getting a little more into parallel computing: how we can distribute simulation computations across different devices. Lately, we have more reasons to do this because quite a few of our simulations require long-running computation, and leaving my laptop running overnight or for several days is just not a good use of it. We have also tried cloud computing, but without knowing how those distributed cores are, well, distributed, it&amp;rsquo;s hard for me to conceptualize how these are done and what else we could optimize. Hence, what better way to learn than doing it on our own! Sit tight, this is going to be a bumpy one. Let&amp;rsquo;s go!&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;parallel.jpg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;




&lt;h2 id=&#34;objectives&#34;&gt;Objectives
  &lt;a href=&#34;#objectives&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#shop&#34;&gt;Which PCs to get?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#install&#34;&gt;Install Ubuntu&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#network&#34;&gt;Align and fix IPs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#passwordless&#34;&gt;Passwordless ssh&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#commands&#34;&gt;Send multiple commands via ssh&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#r&#34;&gt;Install R&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#template&#34;&gt;Create A Template R script For Simulation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#install-packages&#34;&gt;Install Packages On All Nodes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#upload&#34;&gt;Upload Rscript to Nodes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#script&#34;&gt;Run Rscript&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#extract&#34;&gt;Extract data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#compare&#34;&gt;Compare time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#opportunity&#34;&gt;Opportunities for improvement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#lessons&#34;&gt;Lessons learnt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2 id=&#34;shop&#34;&gt;Which PCs to Get?
  &lt;a href=&#34;#shop&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;https://p1-ofp.static.pub/medias/bWFzdGVyfHJvb3R8MTU0MzM5fGltYWdlL3BuZ3xoYzIvaDhhLzk4NDc0MDE3NDIzNjYucG5nfGI1ODRkYjMyY2JmYmIyODJiOWM1YTI1NzhjODBlOWNkYjJlYjgwMDMxMWE1ZTUzZDA1M2YwNDNlNWUxNDM4NmQ/lenovo-thinkcentre-m715-refresh-hero.png?width=400&amp;height=400&#34; alt=&#34;image&#34; width=&#34;40%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;Preferably something functional and cheap! Something like used Lenovo M715q Tiny PCs or similar.&lt;/p&gt;




&lt;h2 id=&#34;install&#34;&gt;Install Ubuntu
  &lt;a href=&#34;#install&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;https://upload.wikimedia.org/wikipedia/commons/thumb/7/76/Ubuntu-logo-2022.svg/500px-Ubuntu-logo-2022.svg.png&#34; alt=&#34;image&#34; width=&#34;50%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Download 
&lt;a href=&#34;https://ubuntu.com/download/server&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Ubuntu Server&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Create a bootable USB using 
&lt;a href=&#34;https://www.balena.io/etcher/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;balenaEtcher&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;When starting the Lenovo up, press F12 repeatedly until it shows an option to boot from USB. If F12 does not work, reboot and press F1 to enter the BIOS. Go to the &lt;code&gt;Startup&lt;/code&gt; tab and change CSM Support to &lt;code&gt;Enabled&lt;/code&gt;. Then set &lt;code&gt;Primary Boot Priority&lt;/code&gt; to &lt;code&gt;USB&lt;/code&gt; by moving it to first place. Press &lt;code&gt;F10&lt;/code&gt; to save the configuration and exit. It will then reboot from the USB drive.&lt;/li&gt;
&lt;li&gt;Make sure it&amp;rsquo;s connected to the internet via LAN for a smoother installation.&lt;/li&gt;
&lt;li&gt;Follow the instructions to install Ubuntu, setting the username, password, etc. Then reboot.&lt;/li&gt;
&lt;li&gt;Make sure to remove the USB drive; if you don&amp;rsquo;t, it&amp;rsquo;ll remind you. Et voila!&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The installations were very quick compared to other OSes I&amp;rsquo;ve installed in the past. Very smooth as well. I thoroughly enjoyed setting these up.&lt;/p&gt;




&lt;h2 id=&#34;network&#34;&gt;Align and Fix IPs
  &lt;a href=&#34;#network&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;For organizational purposes, go to your router settings and assign your cluster nodes convenient fixed IPs such as 192.168.1.101, 192.168.1.102, 192.168.1.103, etc. You may have to reboot the nodes after changing this on your router.&lt;/p&gt;
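&lt;p&gt;As a quick sanity check after fixing the addresses, a small sketch like the following could list the expected node IPs and ping each one. It assumes the &lt;code&gt;192.168.1&lt;/code&gt; prefix and the starting host number 101 from the example above; &lt;code&gt;node_ips&lt;/code&gt; and &lt;code&gt;check_nodes&lt;/code&gt; are hypothetical helper names, not part of any tool.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one IP per node, starting at a given host number.
# Usage: node_ips PREFIX FIRST COUNT
node_ips() {
  local prefix="$1" first="$2" count="$3" i
  for i in $(seq 0 $((count - 1))); do
    printf '%s.%d\n' "$prefix" $((first + i))
  done
}

# Hypothetical helper: ping each node once and report whether it answered.
# Takes the same arguments as node_ips.
check_nodes() {
  local ip
  for ip in $(node_ips "$@"); do
    if ping -c 1 -W 1 "$ip" | grep -q 'bytes from'; then
      echo "$ip reachable"
    else
      echo "$ip not reachable"
    fi
  done
}

# Example: list the three addresses used in this post.
node_ips 192.168.1 101 3
```

&lt;p&gt;Running &lt;code&gt;check_nodes 192.168.1 101 3&lt;/code&gt; from the laptop would then show which nodes picked up their new address.&lt;/p&gt;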




&lt;h2 id=&#34;passwordless&#34;&gt;Passwordless SSH
  &lt;a href=&#34;#passwordless&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Next, set up passwordless SSH. This is crucial for R to work, since a scripted command cannot stop to type a password.&lt;/p&gt;




&lt;h4 id=&#34;1-create-a-key&#34;&gt;1. Create a key
  &lt;a href=&#34;#1-create-a-key&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ssh-keygen -t ed25519
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4 id=&#34;2-send-copy-of-key-to-your-node&#34;&gt;2. Send Copy of Key To Your Node
  &lt;a href=&#34;#2-send-copy-of-key-to-your-node&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ssh-copy-id -i .ssh/my_key.pub username1@192.168.1.101 
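&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Replace &lt;code&gt;.ssh/my_key.pub&lt;/code&gt; with the path of the public key you created in step 1 (by default &lt;code&gt;ssh-keygen -t ed25519&lt;/code&gt; writes it to &lt;code&gt;~/.ssh/id_ed25519.pub&lt;/code&gt;). It will prompt you to enter your password; after that you won&amp;rsquo;t need a password to ssh in.&lt;/p&gt;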
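&lt;p&gt;Optionally, an entry per node in &lt;code&gt;~/.ssh/config&lt;/code&gt; saves retyping usernames, IPs and key paths. The sketch below only prints the stanzas so they can be reviewed first. The aliases &lt;code&gt;node1&lt;/code&gt; to &lt;code&gt;node3&lt;/code&gt; and the helper &lt;code&gt;ssh_host_entry&lt;/code&gt; are hypothetical, and the key path assumes the &lt;code&gt;my_key&lt;/code&gt; name used in the command above.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an ssh_config stanza for one node.
# Usage: ssh_host_entry ALIAS IP USER KEYFILE
ssh_host_entry() {
  printf 'Host %s\n' "$1"
  printf '    HostName %s\n' "$2"
  printf '    User %s\n' "$3"
  printf '    IdentityFile %s\n' "$4"
}

# Print stanzas for the three example nodes. Review the output, then
# append it to ~/.ssh/config (for example by piping through: tee -a ~/.ssh/config).
i=1
for ip in 192.168.1.101 192.168.1.102 192.168.1.103; do
  ssh_host_entry "node$i" "$ip" "username$i" "~/.ssh/my_key"
  i=$((i + 1))
done
```

&lt;p&gt;With such entries in place, &lt;code&gt;ssh node1&lt;/code&gt; would be enough to log in to the first node.&lt;/p&gt;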




&lt;h3 id=&#34;passwordless-sudo&#34;&gt;Passwordless Sudo
  &lt;a href=&#34;#passwordless-sudo&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;This is optional. But if you&amp;rsquo;re like me and don&amp;rsquo;t want to repeat lots of typing during installation, and you want to see whether bash or R can install packages for you, you&amp;rsquo;ll need this.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ssh -t username2@192.168.1.102 &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;echo &amp;#34;$(whoami) ALL=(ALL) NOPASSWD: ALL&amp;#34; | sudo tee /etc/sudoers.d/$(whoami)&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;It will prompt you to enter your password. You will have to do this for each of your nodes.&lt;/p&gt;




&lt;h2 id=&#34;commands&#34;&gt;Send Multiple Commands Via SSH
  &lt;a href=&#34;#commands&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;




&lt;h3 id=&#34;r&#34;&gt;Install R
  &lt;a href=&#34;#r&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;for&lt;/span&gt; host in username1@192.168.1.101 username2@192.168.1.102 username3@192.168.1.103; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;do&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ssh -t &lt;span style=&#34;color:#008080&#34;&gt;$host&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;sudo apt update &amp;amp;&amp;amp; sudo apt install -y r-base r-base-dev&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;done&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This basically installs R on all of our nodes, one after another.&lt;/p&gt;
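&lt;p&gt;The same loop can be wrapped in a small reusable function, for instance to confirm afterwards that &lt;code&gt;Rscript&lt;/code&gt; is available on every node. This is only a sketch: &lt;code&gt;run_on_nodes&lt;/code&gt;, &lt;code&gt;NODES&lt;/code&gt; and the &lt;code&gt;SSH&lt;/code&gt; override are hypothetical names, and the dry run shown here prints what would be sent instead of connecting.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hosts from the loop above; adjust to your own usernames and IPs.
NODES="username1@192.168.1.101 username2@192.168.1.102 username3@192.168.1.103"

# Hypothetical helper: run one command on every node in turn.
# Set SSH=echo to preview what would run without opening a connection.
run_on_nodes() {
  local host
  for host in $NODES; do
    echo "== $host =="
    ${SSH:-ssh} "$host" "$1"
  done
}

# Dry run: prints each host followed by the command instead of connecting.
SSH=echo run_on_nodes 'Rscript --version'
```

&lt;p&gt;Dropping the &lt;code&gt;SSH=echo&lt;/code&gt; prefix would run the command for real over ssh.&lt;/p&gt;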




&lt;h3 id=&#34;template&#34;&gt;Create A Template R script For Simulation
  &lt;a href=&#34;#template&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Why do we do this? We want to take advantage of &lt;code&gt;multicore&lt;/code&gt; on each node, as opposed to using &lt;code&gt;cluster&lt;/code&gt; in &lt;code&gt;future&lt;/code&gt;, because the network overhead may add to the run time and make the optimization less efficient. Instead, we will send a script to each node so that it can fork its own cores to run the simulation. Also, if we specify the packages in our script, we can automate the process of installing them on our nodes.&lt;/p&gt;
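&lt;p&gt;The template leaves &lt;code&gt;START:END&lt;/code&gt; as a placeholder for the range of iterations a node should run. One possible way to fill it in per node is sketched below. The helpers &lt;code&gt;split_range&lt;/code&gt; and &lt;code&gt;fill_template&lt;/code&gt; are hypothetical, and the total of 1000 iterations over 3 nodes is only an illustrative assumption.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical helper: split iterations 1..TOTAL into NODES contiguous chunks
# and print "node start end" per line. The last node absorbs any remainder.
split_range() {
  local total="$1" nodes="$2" size start end i
  size=$((total / nodes))
  for i in $(seq 1 "$nodes"); do
    start=$(( (i - 1) * size + 1 ))
    end=$((i * size))
    if [ "$i" -eq "$nodes" ]; then end="$total"; fi
    echo "$i $start $end"
  done
}

# Hypothetical helper: read a template on stdin, replace the START:END
# placeholder with a concrete range, and write the result to stdout.
fill_template() {
  sed "s/START:END/$1:$2/"
}

# Example: 1000 iterations over 3 nodes.
split_range 1000 3
```

&lt;p&gt;Each printed line gives a node number and its range, which could then be applied with something like &lt;code&gt;cat template.R | fill_template 1 333 | tee sim_node1.R&lt;/code&gt; (file names here are placeholders).&lt;/p&gt;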
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(future)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(future.apply)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(dplyr)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(SuperLearner)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(ranger)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(xgboost)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(glmnet)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plan&lt;/span&gt;(multicore, workers &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W4 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# TRUE propensity score model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-0.5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.8&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# TRUE outcome model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Calculate TRUE ATE&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;logit_Y1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;logit_Y0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y1_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(logit_Y1)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y0_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(logit_Y0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(Y1_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; Y0_true)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W1, W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W2, W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W3, W4 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W4, A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; A, Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; Y)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;tune &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ntrees &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;500&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;),           
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  max_depth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;7&lt;/span&gt;),                  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  shrinkage &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0.001&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;0.01&lt;/span&gt;)    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;tune2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ntrees &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;250&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;500&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  max_depth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;7&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;9&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  shrinkage &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0.001&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;0.005&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;0.01&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;learners &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;create.Learner&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.xgboost&amp;#34;&lt;/span&gt;, tune &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; tune, detailed_names &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;, name_prefix &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgb&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;learners2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;create.Learner&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.xgboost&amp;#34;&lt;/span&gt;, tune &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; tune2, detailed_names &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;, name_prefix &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgb&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Super Learner library &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;SL_library &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.xgboost&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.ranger&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.glm&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.mean&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.xgboost&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.ranger&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.xgboost&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.glm&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.ranger&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.xgboost&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;screen.glmnet&amp;#34;&lt;/span&gt;)),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.glmnet&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.glm&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.ranger&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.glm&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(learners&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;names, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.glm&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(learners&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;names, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.glmnet&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.gam&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.glm&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(learners2&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;names, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.glm&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Iteration indices (reused as seeds) and per-iteration resample size&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;allnum &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; START&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;END
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n_sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(allnum)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Function to run one TMLE iteration&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;run_tmle_iteration &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(seed_val, df, n_i, SL_library) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(seed_val)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(df, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_i, replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(Y, A, W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W4)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Prepare data&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  X_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(A, W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W4) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.data.frame&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  X_treatment &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W4) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.data.frame&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  Y_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;Y
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  A_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;A
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Outcome model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  SL_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;SuperLearner&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; Y_vec,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; X_outcome,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;binomial&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    SL.library &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; SL_library,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    cvControl &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(V &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Initial predictions&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(SL_outcome, newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; X_outcome)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;pred
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Predict under treatment A=1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  X_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; X_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(SL_outcome, newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; X_outcome_1)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;pred
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Predict under treatment A=0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  X_outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; X_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(SL_outcome, newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; X_outcome_0)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;pred
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Bound outcome predictions to avoid qlogis issues&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmax&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmin&lt;/span&gt;(outcome, &lt;span style=&#34;color:#099&#34;&gt;0.9999&lt;/span&gt;), &lt;span style=&#34;color:#099&#34;&gt;0.0001&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmax&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmin&lt;/span&gt;(outcome_1, &lt;span style=&#34;color:#099&#34;&gt;0.9999&lt;/span&gt;), &lt;span style=&#34;color:#099&#34;&gt;0.0001&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmax&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmin&lt;/span&gt;(outcome_0, &lt;span style=&#34;color:#099&#34;&gt;0.9999&lt;/span&gt;), &lt;span style=&#34;color:#099&#34;&gt;0.0001&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Treatment model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  SL_treatment &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;SuperLearner&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; A_vec,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; X_treatment,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;binomial&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    SL.library &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; SL_library,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    cvControl &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(V &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Propensity scores&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ps &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(SL_treatment, newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; X_treatment)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;pred
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Truncate propensity scores &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ps_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmax&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmin&lt;/span&gt;(ps, &lt;span style=&#34;color:#099&#34;&gt;0.95&lt;/span&gt;), &lt;span style=&#34;color:#099&#34;&gt;0.05&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Calculate clever covariates&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  a_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  a_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; ps_final)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  clever_covariate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(A_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, a_1, a_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Fluctuation step: logistic regression of Y on the clever covariate, with the initial fit as offset&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  epsilon_model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;offset&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; clever_covariate, 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                       family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  epsilon &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;coef&lt;/span&gt;(epsilon_model)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Update the counterfactual predictions on the logit scale using epsilon&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  updated_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome_1) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; epsilon &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt; a_1)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  updated_outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome_0) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; epsilon &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt; a_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Calc ATE&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(updated_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; updated_outcome_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Calc SE from the sample variance of the estimated influence curve&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  updated_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(A_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, updated_outcome_1, updated_outcome_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sqrt&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;var&lt;/span&gt;((Y_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; updated_outcome) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt; clever_covariate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                   updated_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; updated_outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; ate) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt; n_i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; ate, se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; se))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Run iterations in parallel for each learner library&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(num in &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;seq_along&lt;/span&gt;(SL_library)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Skip the first nine learner libraries; remove this line to run them all&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(num &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;9&lt;/span&gt;) { next }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;cat&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;Library&amp;#34;&lt;/span&gt;, num, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;- TMLE iterations in parallel with 4 workers (multicore)...\n&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  start_time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;Sys.time&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  results_list &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;future_lapply&lt;/span&gt;(START&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;END, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(i) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    result &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;run_tmle_iteration&lt;/span&gt;(i, df, n_i, SL_library[[num]])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%%&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;100&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;cat&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;Completed iteration:&amp;#34;&lt;/span&gt;, i, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;\n&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(result)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }, future.seed &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  end_time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;Sys.time&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  run_time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; end_time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; start_time
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Extract results&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  predicted_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sapply&lt;/span&gt;(results_list, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(x) x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;ate)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  pred_se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sapply&lt;/span&gt;(results_list, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(x) x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;se)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Results&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  results &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    iteration &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; START&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;END,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; predicted_ate,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; pred_se,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ci_lower &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1.96&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt; se,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ci_upper &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1.96&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt; se,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    covers_truth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; ci_lower &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; ci_upper
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Summary stats&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  summary_stats &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;true_ATE&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;mean_estimated_ATE&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;median_estimated_ATE&amp;#34;&lt;/span&gt;, 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;               &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;sd_estimates&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;mean_SE&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;coverage_probability&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;bias&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    value &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      true_ATE,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(predicted_ate),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;median&lt;/span&gt;(predicted_ate),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sd&lt;/span&gt;(predicted_ate),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(pred_se),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(results&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;covers_truth),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(predicted_ate) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; true_ATE
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Create output directory if it doesn&amp;#39;t exist&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dir.exists&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_results&amp;#34;&lt;/span&gt;)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dir.create&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_results&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Save detailed results (all iterations)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;write.csv&lt;/span&gt;(results, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_results/tmle_iterations&amp;#34;&lt;/span&gt;,num,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;.csv&amp;#34;&lt;/span&gt;), row.names &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Save summary statistics&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;write.csv&lt;/span&gt;(summary_stats, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_results/tmle_summary&amp;#34;&lt;/span&gt;,num,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;.csv&amp;#34;&lt;/span&gt;), row.names &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Save simulation parameters&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  sim_params &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    parameter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;n_population&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;n_sample_iterations&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;n_bootstrap_size&amp;#34;&lt;/span&gt;, 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                  &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL_library&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;n_workers&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;runtime_seconds&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    value &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(n, n_sample, n_i, 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(SL_library[[num]], collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;, &amp;#34;&lt;/span&gt;), 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;(run_time, units &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;secs&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;write.csv&lt;/span&gt;(sim_params, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_results/simulation_parameters&amp;#34;&lt;/span&gt;,num,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;.csv&amp;#34;&lt;/span&gt;), row.names &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Save as RData for easy loading&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;save&lt;/span&gt;(results, summary_stats, sim_params, true_ATE, file &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_results/tmle_results&amp;#34;&lt;/span&gt;,num,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;.RData&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p&gt;What we did above is essentially a template script (saved as &lt;code&gt;par_test_script.R&lt;/code&gt;) in which we can edit the iteration each node starts and ends on, along with instructions for saving the results. This is also where we could put a little more effort into notifying ourselves when a task completes (e.g., via email). It would also be nice to know the ETA of the entire task, perhaps by benchmarking how long the first iteration takes and multiplying that by the total iterations per node. Again, this can be sent via email, and maybe only from the first node rather than all nodes, so we&amp;rsquo;re not bombarded with messages at the beginning and the end. 🤣&lt;/p&gt;
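&lt;p&gt;As a rough illustration of the ETA idea, here is a minimal sketch that assumes the iterations on a node run one after another and each take about as long as the first one. &lt;code&gt;estimate_eta&lt;/code&gt; is a hypothetical helper, not part of the template script, and &lt;code&gt;Sys.sleep&lt;/code&gt; only stands in for the real work.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper: estimate the remaining runtime from the first iteration.
estimate_eta &amp;lt;- function(first_iter_secs, n_iters) {
  # Assumes every remaining iteration takes as long as the first one
  remaining_secs &amp;lt;- first_iter_secs * (n_iters - 1)
  list(remaining_secs = remaining_secs,
       finish_time = Sys.time() + remaining_secs)
}

# Time one stand-in iteration (Sys.sleep is a placeholder for the real work)
start &amp;lt;- Sys.time()
Sys.sleep(0.1)
first_iter_secs &amp;lt;- as.numeric(difftime(Sys.time(), start, units = &amp;#34;secs&amp;#34;))

# 334 is an illustrative number of iterations for one node
eta &amp;lt;- estimate_eta(first_iter_secs, n_iters = 334)
message(sprintf(&amp;#34;Estimated time remaining: %.1f minutes&amp;#34;, eta$remaining_secs / 60))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If a node runs several workers in parallel, dividing the estimate by the number of workers is probably closer to reality.&lt;/p&gt;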




&lt;h3 id=&#34;install-packages&#34;&gt;Install Packages On All Nodes
  &lt;a href=&#34;#install-packages&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## List all of our nodes&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;my_clusters &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;username1@192.168.1.101&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;username2@192.168.1.102&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;username3@192.168.1.103&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## Grab all of the packages needed on our script  &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;packages &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;gsub&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;library\\(([^)]+)\\)&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;\\1&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;grep&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;^library&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;readLines&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;par_test_script.R&amp;#34;&lt;/span&gt;), value &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## Create function to run sudo&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;remote_r_sudo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(host, r_code, intern &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  escaped &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;gsub&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;&amp;#34;&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;\\\\&amp;#34;&amp;#39;&lt;/span&gt;, r_code)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  cmd &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sprintf&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ssh %s &amp;#39;sudo Rscript -e \&amp;#34;%s\&amp;#34;&amp;#39;&amp;#34;&lt;/span&gt;, host, escaped)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(cmd, intern &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; intern)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## Loop over to install&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(cluster_i in my_clusters) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(cluster_i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(package in packages) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    command &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sprintf&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;if (!require(&amp;#34;%s&amp;#34;)) install.packages(&amp;#34;%s&amp;#34;)&amp;#39;&lt;/span&gt;, package, package)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;remote_r_sudo&lt;/span&gt;(cluster_i, command)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Make sure your computer doesn&amp;rsquo;t go to sleep while this runs. If this is the first time your nodes are installing these extensive libraries, it will take a while. Alternatively, we could use &lt;code&gt;future_lapply&lt;/code&gt; to install on all nodes in parallel, and run the installations inside &lt;code&gt;tmux&lt;/code&gt; so they continue even if our local workstation is turned off. See below for how we used &lt;code&gt;tmux&lt;/code&gt; as a set-and-forget method.&lt;/p&gt;




&lt;h2 id=&#34;upload&#34;&gt;Upload Rscript to Nodes
  &lt;a href=&#34;#upload&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Alright, now that we have installed the appropriate packages above, let&amp;rsquo;s upload the scripts to our nodes.&lt;/p&gt;




&lt;h4 id=&#34;distribute-work&#34;&gt;Distribute Work
  &lt;a href=&#34;#distribute-work&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;num_list &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;clust_num &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;total_loop &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;div_iter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; total_loop&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;clust_num
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_iter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; total_loop &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#only use this for custom e.g., if one node did not work and it&amp;#39;s in charge of 300:500, we can put 500 for this and set first_iter as 300&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;first_iter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;last_iter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(div_iter,&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; first_iter
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;clust_num) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; clust_num) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    num_list[[i]] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(first_iter,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;:&amp;#34;&lt;/span&gt;,final_iter)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    next
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  num_list[[i]] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(first_iter,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;:&amp;#34;&lt;/span&gt;,last_iter)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  first_iter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(first_iter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; div_iter, &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  last_iter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(last_iter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; div_iter, &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;num_list
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [[1]]
## [1] &amp;#34;1:334&amp;#34;
## 
## [[2]]
## [1] &amp;#34;334:667&amp;#34;
## 
## [[3]]
## [1] &amp;#34;667:1000&amp;#34;
&lt;/code&gt;&lt;/pre&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(my_clusters)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  username &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sub&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;@.*&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;,my_clusters[[i]])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sprintf&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;sed &amp;#39;s/START:END/%s/g&amp;#39; par_test_script.R &amp;gt; par_test_script1.R &amp;amp;&amp;amp; scp par_test_script1.R %s:/home/%s/par_test_script1.R&amp;#34;&lt;/span&gt;,num_list[[i]],my_clusters[[i]],username))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
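&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We&amp;rsquo;ll iterate over the nodes, insert the appropriate iteration range for each one, and save the result to &lt;code&gt;par_test_script1.R&lt;/code&gt;, then upload it to each node with the code above. Note that with the rounding used here, neighbouring ranges share their boundary iteration (334 and 667 each appear in two ranges), so those two iterations are run twice.&lt;/p&gt;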
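&lt;p&gt;The ranges printed above overlap at 334 and 667. If that is not wanted, one possible alternative is sketched below; &lt;code&gt;split_iters&lt;/code&gt; is a hypothetical helper, not something used in the original run. It assigns every iteration to exactly one chunk and then reports each chunk as a &lt;code&gt;start:end&lt;/code&gt; string.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper: split 1..total_loop into clust_num contiguous,
# non-overlapping &amp;#34;start:end&amp;#34; ranges.
split_iters &amp;lt;- function(total_loop, clust_num) {
  iters &amp;lt;- seq_len(total_loop)
  # Map each iteration to a chunk id in 1..clust_num, as evenly as possible
  chunk_id &amp;lt;- ceiling(iters * clust_num / total_loop)
  chunks &amp;lt;- split(iters, chunk_id)
  # Describe each chunk by its first and last iteration
  lapply(unname(chunks), function(idx) paste0(min(idx), &amp;#34;:&amp;#34;, max(idx)))
}

split_iters(1000, 3)
# list of &amp;#34;1:333&amp;#34;, &amp;#34;334:666&amp;#34;, &amp;#34;667:1000&amp;#34;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The returned list has the same shape as &lt;code&gt;num_list&lt;/code&gt;, so it could be dropped into the &lt;code&gt;sed&lt;/code&gt; loop in its place.&lt;/p&gt;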




&lt;h4 id=&#34;check-setseed-in-multicore&#34;&gt;Check set.seed in multicore
  &lt;a href=&#34;#check-setseed-in-multicore&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(seed, df, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6000&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(seed)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  df_sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n, .data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; df)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(df_sample)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;future_lapply&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;100&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(x) &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sample_df&lt;/span&gt;(seed&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;x,df&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;df))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We ran the above both on the local computer and in a terminal with multicore, and the sampled data are still the same! Woo hoo!&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;seed1.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;seed2.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;The interesting thing is that I didn&amp;rsquo;t have to set &lt;code&gt;future.seed = TRUE&lt;/code&gt; or &lt;code&gt;future.seed = some_number&lt;/code&gt; for this, because &lt;code&gt;sample_df&lt;/code&gt; calls &lt;code&gt;set.seed(seed)&lt;/code&gt; itself on every call. However, if we pass a number to &lt;code&gt;future.seed&lt;/code&gt;, it also returns reproducible data! This is great: next time I&amp;rsquo;ll just use that and won&amp;rsquo;t need &lt;code&gt;set.seed(i)&lt;/code&gt;. 🙌&lt;/p&gt;
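&lt;p&gt;To illustrate the &lt;code&gt;future.seed&lt;/code&gt; behaviour, here is a minimal sketch that assumes the &lt;code&gt;future&lt;/code&gt; and &lt;code&gt;future.apply&lt;/code&gt; packages are installed. &lt;code&gt;draw&lt;/code&gt; is a hypothetical function that never calls &lt;code&gt;set.seed&lt;/code&gt;, and the seed value 123 and the two workers are arbitrary choices.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(future)
library(future.apply)

# Two local background workers, just for the demonstration
plan(multisession, workers = 2)

# Hypothetical function that draws random numbers without set.seed()
draw &amp;lt;- function(x) rnorm(3)

# Same future.seed on both runs
run1 &amp;lt;- future_lapply(1:4, draw, future.seed = 123)
run2 &amp;lt;- future_lapply(1:4, draw, future.seed = 123)

# Compare the two runs; they should match if future.seed fixes the RNG streams
identical(run1, run2)

# Return to sequential processing
plan(sequential)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;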




&lt;h2 id=&#34;script&#34;&gt;Run Rscript
  &lt;a href=&#34;#script&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(my_clusters)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# set your tmux new session name, here we call it &amp;#34;test&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  cluster_name &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;test&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# terminate any existing tmux with the existing name&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sprintf&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ssh %s &amp;#39;tmux kill-session -t %s 2&amp;gt;/dev/null || true&amp;#39;&amp;#34;&lt;/span&gt;, my_clusters[[i]], cluster_name))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# create new tmux session&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sprintf&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ssh %s &amp;#39;tmux new-session -d -s %s&amp;#39;&amp;#34;&lt;/span&gt;, my_clusters[[i]], cluster_name))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# run rscript in tmux&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sprintf&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ssh %s &amp;#39;tmux send-keys -t %s \&amp;#34;Rscript par_test_script1.R &amp;gt; result_%d.txt\&amp;#34;&amp;#39; ENTER&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                 my_clusters[[i]], cluster_name, i))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The code above is quite self-explanatory. Once it has run, there we have it: the script should be running in the background on every node! 🙌 You can do a spot check to see whether it&amp;rsquo;s actually running. Once the runs complete, we&amp;rsquo;ll extract the data.&lt;/p&gt;
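&lt;p&gt;One way to do that spot check from the local machine is sketched below. &lt;code&gt;spot_check_cmd&lt;/code&gt; is a hypothetical helper; it assumes the session name and the &lt;code&gt;result_%d.txt&lt;/code&gt; file from the loop above. &lt;code&gt;tmux has-session&lt;/code&gt; only confirms that the session exists, so the command also prints the last few lines of the result file.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper: build an ssh command that checks the tmux session
# and, if it exists, shows the tail of that node&amp;#39;s result file.
spot_check_cmd &amp;lt;- function(host, session, i) {
  sprintf(&amp;#34;ssh %s &amp;#39;tmux has-session -t %s &amp;amp;&amp;amp; tail -n 5 result_%d.txt&amp;#39;&amp;#34;,
          host, session, i)
}

# Print the command for the first node (placeholder address from above)
cat(spot_check_cmd(&amp;#34;username1@192.168.1.101&amp;#34;, &amp;#34;test&amp;#34;, 1L), &amp;#34;\n&amp;#34;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Passing the returned string to &lt;code&gt;system()&lt;/code&gt; inside a loop over &lt;code&gt;my_clusters&lt;/code&gt; would run the check on every node; a non-zero exit status suggests the session is gone or the result file is missing.&lt;/p&gt;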




&lt;h2 id=&#34;extract&#34;&gt;Extract Data
  &lt;a href=&#34;#extract&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Since we have 10 combinations we want to assess, we will set &lt;code&gt;nums&lt;/code&gt; to &lt;code&gt;1:10&lt;/code&gt; and get our data! In your template script you can save your data however you want; for extraction, just look for those files, download them, then read and merge. Or do it however you prefer.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;nums &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(num in nums) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(num)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(my_clusters)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  response &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sprintf&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;scp %s:tmle_results/simulation_parameters%d.csv simulation_parameters%d.csv&amp;#34;&lt;/span&gt;, my_clusters[[i]], num, num), intern &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(response &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) { next }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  df_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;simulation_parameters&amp;#34;&lt;/span&gt;,num,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;.csv&amp;#34;&lt;/span&gt;), show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt;) 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  sl_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(parameter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL_library&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(value)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbind&lt;/span&gt;(df, df_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(method &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sl_i, num &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; num))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_sim_param &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(num in nums) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(my_clusters)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  response &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sprintf&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;scp %s:tmle_results/tmle_iterations%d.csv tmle_iterations%d.csv&amp;#34;&lt;/span&gt;, my_clusters[[i]], num, num), intern &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(response &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) { &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(my_clusters[[i]],&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; is missing num &amp;#34;&lt;/span&gt;, num)) ; next }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  df_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_iterations&amp;#34;&lt;/span&gt;,num,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;.csv&amp;#34;&lt;/span&gt;), show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(num &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; num)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbind&lt;/span&gt;(df, df_i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_iter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;blockquote&gt;
&lt;p&gt;Take note that you may occasionally run into issues. If for some reason a node is unable to complete its task, you can identify which task failed and then redistribute it across the entire computer cluster.&lt;/p&gt;
&lt;/blockquote&gt;
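&lt;p&gt;As a minimal sketch of the &amp;ldquo;identify it&amp;rdquo; step, assuming the &lt;code&gt;nums&lt;/code&gt; vector and the &lt;code&gt;num&lt;/code&gt; column of &lt;code&gt;df_iter&lt;/code&gt; from the loops above: comparing the two shows which runs never came back from any node. The &lt;code&gt;df_iter&lt;/code&gt; below is a made-up stand-in so the snippet runs on its own, not real results.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(tibble)

nums &amp;lt;- 1:10

# Toy stand-in for the collected results (hypothetical data):
# pretend runs 4 and 9 were never returned by any node.
df_iter &amp;lt;- tibble(num = rep(setdiff(nums, c(4, 9)), each = 2))

# Runs that were expected but are absent from the collected results
missing_nums &amp;lt;- setdiff(nums, unique(df_iter$num))
print(missing_nums)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Whatever ends up in &lt;code&gt;missing_nums&lt;/code&gt; is the set of runs to resubmit across the cluster.&lt;/p&gt;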




&lt;h2 id=&#34;compare&#34;&gt;Compare Time
  &lt;a href=&#34;#compare&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Let&amp;rsquo;s take a look at our compute time for a single quad-core machine with 5-fold CV, a 3-PC cluster with 5-fold CV, and a 3-PC cluster with 10-fold CV.&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; method &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; hour_1clus_cv5 &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; hour_3clus_cv5 &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; hour_3clus_cv10 &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.xgboost, SL.ranger, SL.glm, SL.mean &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 4.02 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1.4126466 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2.5179200 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.xgboost, SL.ranger &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 4.00 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1.4136567 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2.5108584 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.xgboost, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.47 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.1680019 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.3034212 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.ranger, c(&#34;SL.xgboost&#34;, &#34;screen.glmnet&#34;) &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 4.23 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1.4960542 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2.5165429 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.glmnet, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; NA &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.1074466 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.1995869 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.ranger, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; NA &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1.2544446 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2.2254909 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgb_500_5_0.001, xgb_1000_5_0.001, xgb_500_7_0.001, xgb_1000_7_0.001, xgb_500_5_0.01, xgb_1000_5_0.01, xgb_500_7_0.01, xgb_1000_7_0.01, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 3.29 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1.8059939 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 3.3030737 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgb_500_5_0.001, xgb_1000_5_0.001, xgb_500_7_0.001, xgb_1000_7_0.001, xgb_500_5_0.01, xgb_1000_5_0.01, xgb_500_7_0.01, xgb_1000_7_0.01, SL.glmnet &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; NA &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1.8956873 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 3.4821903 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.gam, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; NA &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.1094693 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.2072266 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgb_250_3_0.001, xgb_500_3_0.001, xgb_1000_3_0.001, xgb_250_5_0.001, xgb_500_5_0.001, xgb_1000_5_0.001, xgb_250_7_0.001, xgb_500_7_0.001, xgb_1000_7_0.001, xgb_250_9_0.001, xgb_500_9_0.001, xgb_1000_9_0.001, xgb_250_3_0.005, xgb_500_3_0.005, xgb_1000_3_0.005, xgb_250_5_0.005, xgb_500_5_0.005, xgb_1000_5_0.005, xgb_250_7_0.005, xgb_500_7_0.005, xgb_1000_7_0.005, xgb_250_9_0.005, xgb_500_9_0.005, xgb_1000_9_0.005, xgb_250_3_0.01, xgb_500_3_0.01, xgb_1000_3_0.01, xgb_250_5_0.01, xgb_500_5_0.01, xgb_1000_5_0.01, xgb_250_7_0.01, xgb_500_7_0.01, xgb_1000_7_0.01, xgb_250_9_0.01, xgb_500_9_0.01, xgb_1000_9_0.01, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; NA &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; NA &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 4.6127172 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Looking at the times, we can definitely see the improvement going from 1 machine to the 3-PC cluster. Take a look at our good old tuned xgboost and logistic regression: it previously took a single quad-core 3.29 hours to complete, and that is now down to 1.8 hours. You&amp;rsquo;d imagine that with 3 PCs as a cluster we would see it drop to ~1.1 hours (3.29 / 3), but I guess not for xgboost. Will have to investigate this. But if we look at xgboost + logistic regression without tuning, we went from 0.47 hours to 0.17 hours, which makes sense! Very interesting. Now if we up our CV to 10-fold, it takes longer (makes sense), but for most libraries it is still faster than using 1 quad-core; tuned xgboost + lr is the exception at 3.30 vs 3.29 hours. I&amp;rsquo;ve heard people say that if you increase K in K-fold CV, you reduce bias but increase variance. Let&amp;rsquo;s see if that&amp;rsquo;s true in our case here.&lt;/p&gt;
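&lt;p&gt;To put numbers on that expectation, here is a rough sketch that assumes three equivalent machines and ideal linear scaling. The hours are copied from the timing table above for the rows that have a 1-machine time; the shortened method labels and the column names &lt;code&gt;ideal_hour&lt;/code&gt;, &lt;code&gt;speedup&lt;/code&gt; and &lt;code&gt;efficiency&lt;/code&gt; are only for illustration.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(tibble)
library(dplyr)

n_nodes &amp;lt;- 3  # number of PCs in the cluster

# Hours copied from the timing table (rows with a 1-machine time)
timing &amp;lt;- tibble(
  method = c(&amp;#34;xgb + rf + lr + mean&amp;#34;, &amp;#34;xgb + rf&amp;#34;, &amp;#34;xgb + lr&amp;#34;,
             &amp;#34;rf + (xgb + preprocess w glmnet)&amp;#34;, &amp;#34;tuned xgb + lr&amp;#34;),
  hour_1clus_cv5 = c(4.02, 4.00, 0.47, 4.23, 3.29),
  hour_3clus_cv5 = c(1.4126466, 1.4136567, 0.1680019, 1.4960542, 1.8059939)
)

timing |&amp;gt;
  mutate(
    ideal_hour = hour_1clus_cv5 / n_nodes,        # perfect linear scaling
    speedup    = hour_1clus_cv5 / hour_3clus_cv5, # observed speedup
    efficiency = speedup / n_nodes                # 1 means perfect scaling
  )
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;On these numbers the untuned libraries speed up by roughly 2.8x on 3 PCs, while tuned xgboost + lr only manages about 1.8x, which is the gap worth investigating.&lt;/p&gt;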
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; method &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; bias_3clus_cv5 &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; bias_3clus_cv10 &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; variance_3clus_cv5 &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; variance_3clus_cv10 &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.xgboost, SL.ranger, SL.glm, SL.mean &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0007695 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0007257 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001866 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001940 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.xgboost, SL.ranger &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0007677 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0007257 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001866 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001940 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.xgboost, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0010481 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001018 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001586 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001617 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.ranger, c(&#34;SL.xgboost&#34;, &#34;screen.glmnet&#34;) &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0008349 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0007257 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001868 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001940 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.glmnet, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0449075 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0449065 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001502 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001503 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.ranger, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0007695 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0007257 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001866 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001940 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgb_500_5_0.001, xgb_1000_5_0.001, xgb_500_7_0.001, xgb_1000_7_0.001, xgb_500_5_0.01, xgb_1000_5_0.01, xgb_500_7_0.01, xgb_1000_7_0.01, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0006449 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0010681 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001491 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001504 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgb_500_5_0.001, xgb_1000_5_0.001, xgb_500_7_0.001, xgb_1000_7_0.001, xgb_500_5_0.01, xgb_1000_5_0.01, xgb_500_7_0.01, xgb_1000_7_0.01, SL.glmnet &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0005986 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0010492 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001502 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001511 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.gam, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0062967 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0062967 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001537 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001537 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgb_250_3_0.001, xgb_500_3_0.001, xgb_1000_3_0.001, xgb_250_5_0.001, xgb_500_5_0.001, xgb_1000_5_0.001, xgb_250_7_0.001, xgb_500_7_0.001, xgb_1000_7_0.001, xgb_250_9_0.001, xgb_500_9_0.001, xgb_1000_9_0.001, xgb_250_3_0.005, xgb_500_3_0.005, xgb_1000_3_0.005, xgb_250_5_0.005, xgb_500_5_0.005, xgb_1000_5_0.005, xgb_250_7_0.005, xgb_500_7_0.005, xgb_1000_7_0.005, xgb_250_9_0.005, xgb_500_9_0.005, xgb_1000_9_0.005, xgb_250_3_0.01, xgb_500_3_0.01, xgb_1000_3_0.01, xgb_250_5_0.01, xgb_500_5_0.01, xgb_1000_5_0.01, xgb_250_7_0.01, xgb_500_7_0.01, xgb_1000_7_0.01, xgb_250_9_0.01, xgb_500_9_0.01, xgb_1000_9_0.01, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; NA &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0013250 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; NA &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001528 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Wow, not too shabby! Going from cv5 to cv10, most of the default libraries show a slightly smaller bias and a slightly larger variance. How about that! gam + lr did not change at all, which makes sense because we don&amp;rsquo;t really tune them, and glmnet + lr barely moved. The tuned xgboost libraries went the other way on bias, from about 0.0006 to about 0.0011. That being said, I wonder what&amp;rsquo;s under the hood that controls the knots for gam in superlearner. Will need to check that out. With this, it looks like tuned xgboost + lr might have the best numbers. Well, now that we&amp;rsquo;ve seen bias and variance, what about coverage?&lt;/p&gt;
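&lt;p&gt;The post does not show how the bias and variance columns were computed. A common definition, and the one assumed in this sketch, is the mean of &lt;code&gt;ate - true_ATE&lt;/code&gt; and the variance of &lt;code&gt;ate&lt;/code&gt; across iterations, per &lt;code&gt;num&lt;/code&gt;. The data frame &lt;code&gt;toy_iter&lt;/code&gt; is simulated with made-up numbers and only borrows the column names used for &lt;code&gt;df_iter&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(tibble)
library(dplyr)

set.seed(1)

# Toy stand-in for df_iter (hypothetical numbers, not the real simulation)
toy_iter &amp;lt;- tibble(
  num      = rep(1:2, each = 1000),
  true_ATE = 0.1,
  ate      = rnorm(2000, mean = 0.1, sd = 0.013)
)

toy_iter |&amp;gt;
  group_by(num) |&amp;gt;
  summarise(
    bias     = mean(ate - true_ATE),  # average signed error of the estimate
    variance = var(ate)               # spread of the estimates across iterations
  )
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;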
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; method &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; coverage_3clus_cv5 &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; coverage_3clus_cv10 &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.xgboost, SL.ranger, SL.glm, SL.mean &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.536 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.517 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.xgboost, SL.ranger &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.536 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.517 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.xgboost, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.811 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.799 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.ranger, c(&#34;SL.xgboost&#34;, &#34;screen.glmnet&#34;) &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.539 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.517 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.glmnet, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.051 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.052 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.ranger, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.536 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.517 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgb_500_5_0.001, xgb_1000_5_0.001, xgb_500_7_0.001, xgb_1000_7_0.001, xgb_500_5_0.01, xgb_1000_5_0.01, xgb_500_7_0.01, xgb_1000_7_0.01, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.882 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.878 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgb_500_5_0.001, xgb_1000_5_0.001, xgb_500_7_0.001, xgb_1000_7_0.001, xgb_500_5_0.01, xgb_1000_5_0.01, xgb_500_7_0.01, xgb_1000_7_0.01, SL.glmnet &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.881 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.876 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.gam, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.926 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.926 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgb_250_3_0.001, xgb_500_3_0.001, xgb_1000_3_0.001, xgb_250_5_0.001, xgb_500_5_0.001, xgb_1000_5_0.001, xgb_250_7_0.001, xgb_500_7_0.001, xgb_1000_7_0.001, xgb_250_9_0.001, xgb_500_9_0.001, xgb_1000_9_0.001, xgb_250_3_0.005, xgb_500_3_0.005, xgb_1000_3_0.005, xgb_250_5_0.005, xgb_500_5_0.005, xgb_1000_5_0.005, xgb_250_7_0.005, xgb_500_7_0.005, xgb_1000_7_0.005, xgb_250_9_0.005, xgb_500_9_0.005, xgb_1000_9_0.005, xgb_250_3_0.01, xgb_500_3_0.01, xgb_1000_3_0.01, xgb_250_5_0.01, xgb_500_5_0.01, xgb_1000_5_0.01, xgb_250_7_0.01, xgb_500_7_0.01, xgb_1000_7_0.01, xgb_250_9_0.01, xgb_500_9_0.01, xgb_1000_9_0.01, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; NA &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.844 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Wow, I was not expecting gam + lr to have so much coverage! But looking at bias from the previous table, it&#39;s actually quite horrible. So it seems like gam + lr is asymmetrical in its estimates, sometimes overestimating, sometimes underestimating, leading to a wider confidence interval, hence more coverage. That being said, it&#39;s not a good estimator because of its bias. Tuned xgboost + glmnet seems to be the best bet here, with low bias, low variance and decent coverage. Let&#39;s visualize it!&lt;/p&gt;




&lt;h4 id=&#34;5-fold-cv&#34;&gt;5-fold CV
  &lt;a href=&#34;#5-fold-cv&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidyverse)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;num_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; sim_param_cv5_clus5 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(num, method)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_coverage &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df_iter_cv5_clus3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;group_by&lt;/span&gt;(num) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;arrange&lt;/span&gt;(ate) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(iter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;row_number&lt;/span&gt;()) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(cover &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    covers_truth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;&lt;/span&gt; true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;right_missed&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    covers_truth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;&lt;/span&gt; true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;left_missed&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    covers_truth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;covered&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(num, cover) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;group_by&lt;/span&gt;(num, cover) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tally&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ungroup&lt;/span&gt;(cover) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(prop &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;100&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sum&lt;/span&gt;(n)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pivot_wider&lt;/span&gt;(id_cols &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; num, names_from &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;cover&amp;#34;&lt;/span&gt;, values_from &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prop&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(text &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;right missed: &amp;#34;&lt;/span&gt;,right_missed,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;% covered: &amp;#34;&lt;/span&gt;,covered,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;% left missed: &amp;#34;&lt;/span&gt;,left_missed,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;%&amp;#34;&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(num, text)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;method &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  num &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;9&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  method &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgb + rf + lr + mean&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgb + rf&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgb + lr&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rf + (xgb + preprocess w glmnet)&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;glmnet + lr&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rf + lr&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tuned xgb + lr&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tuned xgb + glmnet&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;gam + lr&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df_iter_cv5_clus3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;group_by&lt;/span&gt;(num) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;arrange&lt;/span&gt;(ate) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(iter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;row_number&lt;/span&gt;()) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(cover &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    covers_truth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;&lt;/span&gt; true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;right_missed&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    covers_truth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;&lt;/span&gt; true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;left_missed&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    covers_truth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;covered&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggplot&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;iter,y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;ate,color&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;cover)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_point&lt;/span&gt;(alpha&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_errorbar&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;iter,ymin&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;ci_lower,ymax&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;ci_upper), alpha&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_hline&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(yintercept&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0.0373518&lt;/span&gt;), color &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;blue&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_text&lt;/span&gt;(data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; df_coverage,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;500&lt;/span&gt;, label &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; text),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-0.05&lt;/span&gt;,  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            inherit.aes &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            size &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            hjust &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;scale_color_manual&lt;/span&gt;(values &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;covered&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;#619CFF&amp;#34;&lt;/span&gt;, 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                                  &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;left_missed&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;#F8766D&amp;#34;&lt;/span&gt;, 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                                  &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;right_missed&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;#00BA38&amp;#34;&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme_bw&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;facet_wrap&lt;/span&gt;(.~num, ncol &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;,labeller &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as_labeller&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;setNames&lt;/span&gt;(method&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;method, method&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;num))) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme&lt;/span&gt;(legend.position &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;bottom&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;br&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/parallel-computing/index_files/figure-html/unnamed-chunk-19-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;&lt;code&gt;lr&lt;/code&gt;: logistic regression, &lt;code&gt;xgb&lt;/code&gt;: xgboost, &lt;code&gt;rf&lt;/code&gt; : random forest, &lt;code&gt;gam&lt;/code&gt; : generalized additive model.&lt;/p&gt;
&lt;p&gt;Wow, look at gam + lr&amp;rsquo;s asymmetrical coverage! This shows that a single point estimate of coverage is not adequate for assessing the overall usefulness of a method. We can see that this method is clearly biased, with asymmetrical tails. Since CV5 and CV10 do not differ much in coverage, we&amp;rsquo;ll skip the CV10 visualization.&lt;/p&gt;
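&lt;p&gt;To put numbers on the asymmetry, one option is to report the share of intervals that miss on each side alongside the overall coverage. The snippet below is only a minimal sketch on made-up intervals: the &lt;code&gt;toy&lt;/code&gt; data, &lt;code&gt;true_value&lt;/code&gt;, &lt;code&gt;miss_below&lt;/code&gt; and &lt;code&gt;miss_above&lt;/code&gt; are hypothetical names for illustration, not objects from the simulation above.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(dplyr)

# Toy intervals with made-up bounds (illustrative values only, not results from the post)
toy &amp;lt;- tibble(
  ci_lower = c(0.01, 0.02, 0.05, -0.02, 0.00),
  ci_upper = c(0.05, 0.06, 0.09,  0.03, 0.04)
)
true_value &amp;lt;- 0.0373518  # same reference value as the blue line in the plot

tail_summary &amp;lt;- toy |&amp;gt;
  summarise(
    coverage   = mean(ci_lower &amp;lt;= true_value &amp;amp; true_value &amp;lt;= ci_upper),
    miss_below = mean(ci_upper &amp;lt; true_value),  # whole interval sits below the reference
    miss_above = mean(ci_lower &amp;gt; true_value)   # whole interval sits above the reference
  )

tail_summary
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;A method with symmetric misses would show similar &lt;code&gt;miss_below&lt;/code&gt; and &lt;code&gt;miss_above&lt;/code&gt; values; a large gap between the two points to bias in one direction even when overall coverage looks acceptable.&lt;/p&gt;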




&lt;h2 id=&#34;opportunities&#34;&gt;Opportunities for improvement
  &lt;a href=&#34;#opportunities&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Plenty of opportunities to turn this personal project into a package that will help us&lt;/li&gt;
&lt;li&gt;Use parallel computing locally to run system commands (such as package installation), since these take a lot of time&lt;/li&gt;
&lt;li&gt;Write a function to let us know when tasks are completed&lt;/li&gt;
&lt;li&gt;Write a function to estimate time of completion&lt;/li&gt;
&lt;li&gt;Write a function to redistribute missing iterations&lt;/li&gt;
&lt;li&gt;Learn Open MPI&lt;/li&gt;
&lt;li&gt;Make a package for the functions above so I can reuse them in the future&lt;/li&gt;
&lt;/ul&gt;
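&lt;p&gt;As a starting point for the helper functions listed above, here is a rough base R sketch. All three function names (&lt;code&gt;find_missing&lt;/code&gt;, &lt;code&gt;estimate_remaining_secs&lt;/code&gt;, &lt;code&gt;redistribute&lt;/code&gt;) are hypothetical, and the time estimate naively assumes the remaining iterations run at the average pace so far.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper: which iteration ids have no saved result yet?
find_missing &amp;lt;- function(expected_ids, completed_ids) {
  setdiff(expected_ids, completed_ids)
}

# Hypothetical helper: naive remaining time in seconds,
# assuming the remaining iterations run at the average pace so far
estimate_remaining_secs &amp;lt;- function(n_done, n_total, elapsed_secs) {
  if (n_done == 0) return(NA_real_)  # no pace information yet
  (n_total - n_done) * elapsed_secs / n_done
}

# Hypothetical helper: spread missing ids across nodes in round-robin order
redistribute &amp;lt;- function(missing_ids, n_nodes) {
  split(missing_ids, rep_len(seq_len(n_nodes), length(missing_ids)))
}

missing_ids &amp;lt;- find_missing(1:10, c(1, 2, 4, 5, 6, 9))
assignment  &amp;lt;- redistribute(missing_ids, n_nodes = 3)
eta_secs    &amp;lt;- estimate_remaining_secs(n_done = 250, n_total = 1000, elapsed_secs = 600)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here &lt;code&gt;assignment&lt;/code&gt; is a list with one element per node, so each element could be passed to whatever launches the work on that node.&lt;/p&gt;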




&lt;h2 id=&#34;lesson&#34;&gt;Lessons Learnt:
  &lt;a href=&#34;#lesson&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Used &lt;code&gt;sprintf&lt;/code&gt; more during this learning experience, particularly alongside system calls&lt;/li&gt;
&lt;li&gt;Learnt that in &lt;code&gt;future_lapply&lt;/code&gt; with multicore, setting &lt;code&gt;future.seed = 100&lt;/code&gt; (or any other fixed number) helps reproduce the same data&lt;/li&gt;
&lt;li&gt;Made a few pipelines to install packages on multiple nodes&lt;/li&gt;
&lt;li&gt;Learnt that &lt;code&gt;set.seed&lt;/code&gt; in multicore works fine&lt;/li&gt;
&lt;li&gt;Observed reduced bias with increased variance from CV5 to CV10&lt;/li&gt;
&lt;/ul&gt;
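&lt;p&gt;The &lt;code&gt;future.seed&lt;/code&gt; point can be checked with a small sketch like the one below. It uses &lt;code&gt;multisession&lt;/code&gt; with 2 workers purely so the example also runs where forking is unavailable; that backend choice and the &lt;code&gt;run_once&lt;/code&gt; helper are assumptions for illustration, not the setup used in this post.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(future)
library(future.apply)

# Assumption: multisession backend with 2 workers, chosen for portability
plan(multisession, workers = 2)

# Hypothetical helper: draw one random number per iteration with a fixed future.seed
run_once &amp;lt;- function() {
  future_lapply(1:4, function(i) rnorm(1), future.seed = 100)
}

first  &amp;lt;- run_once()
second &amp;lt;- run_once()

# Compare the two runs element by element
same_draws &amp;lt;- identical(first, second)

plan(sequential)  # return to the default backend
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With a fixed integer &lt;code&gt;future.seed&lt;/code&gt;, the two runs should produce the same draws, so &lt;code&gt;same_draws&lt;/code&gt; is expected to be &lt;code&gt;TRUE&lt;/code&gt;.&lt;/p&gt;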
&lt;p&gt;If you like this article:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;please feel free to send me a 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;comment or visit my other blogs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;please feel free to follow me on 
&lt;a href=&#34;https://bsky.app/profile/kenkoonwong.bsky.social&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;BlueSky&lt;/a&gt;, 
&lt;a href=&#34;https://twitter.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;twitter&lt;/a&gt;, 
&lt;a href=&#34;https://github.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub&lt;/a&gt; or 
&lt;a href=&#34;https://med-mastodon.com/@kenkoonwong&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Mastodon&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;if you would like to collaborate, please feel free to 
&lt;a href=&#34;https://www.kenkoonwong.com/contact/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;contact me&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>Testing Super Learner&#39;s Coverage - A Note To Myself</title>
      <link>https://www.kenkoonwong.com/blog/superlearner-tmle/</link>
      <pubDate>Fri, 02 Jan 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.kenkoonwong.com/blog/superlearner-tmle/</guid>
      <description>&lt;script src=&#34;https://www.kenkoonwong.com/blog/superlearner-tmle/index_files/kePrint/kePrint.js&#34;&gt;&lt;/script&gt;
&lt;link href=&#34;https://www.kenkoonwong.com/blog/superlearner-tmle/index_files/lightable/lightable.css&#34; rel=&#34;stylesheet&#34; /&gt;
&lt;blockquote&gt;
&lt;p&gt;Testing Super Learner with TMLE showed some interesting patterns 🤔 XGBoost + random forest only hit ~54% coverage, but tuned xgboost + GLM reached ~90%. Seems like pairing flexible learners with stable (even misspecified) models helps? Need to explore this more with different setups 📊&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src=&#34;superkids.jpg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;




&lt;h2 id=&#34;motivations&#34;&gt;Motivations:
  &lt;a href=&#34;#motivations&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;We have previously looked at 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/tmle/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;TMLE&lt;/a&gt; without Super Learner and observed the poor coverage of TMLE with unspecified xgboost when compared to a correctly specified logistic regression. Now that we&amp;rsquo;ve learnt 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/superlearner/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;NNLS&lt;/a&gt;, the default method behind 
&lt;a href=&#34;https://cran.r-project.org/web/packages/SuperLearner/vignettes/Guide-to-SuperLearner.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Super Learner&lt;/a&gt;, why don&amp;rsquo;t we test it out and take a look at the coverage? Let&amp;rsquo;s first revisit our previous coverage plot.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://www.kenkoonwong.com/blog/tmle/index_files/figure-html/unnamed-chunk-31-1.png&#34; alt=&#34;&#34;&gt;
We saw that with xgboost and TMLE, our coverage was about 71.1-87.6%, and the lower and upper tails were asymmetrical. If we use Super Learner, can we improve this coverage?&lt;/p&gt;




&lt;h2 id=&#34;objectives&#34;&gt;Objectives
  &lt;a href=&#34;#objectives&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#code&#34;&gt;Let&amp;rsquo;s Code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#result&#34;&gt;What&amp;rsquo;s The result?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#opportunity&#34;&gt;Opportunities for improvement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#lessons&#34;&gt;Lessons Learnt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2 id=&#34;code&#34;&gt;Let&amp;rsquo;s Code
  &lt;a href=&#34;#code&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;We will be using the same data generating mechanism as before, but this time we will use Super Learner to estimate the propensity score and the outcome regression. Let&amp;rsquo;s write code that uses &lt;code&gt;multicore&lt;/code&gt; and runs on a cheap second-hand quad-core Lenovo with Ubuntu to compute and store the results. Part of my learning goal for 2026 is to experiment more with parallel computing and see if we can explore mathematical platonic space more using simulation! Stay tuned, there is more to come on the simple parallel computing cluster setup!&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(dplyr)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(SuperLearner)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(future)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(future.apply)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Set up parallel processing&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plan&lt;/span&gt;(multicore, workers &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W4 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# TRUE propensity score model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-0.5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.8&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# TRUE outcome model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Calculate TRUE ATE&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;logit_Y1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;logit_Y0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y1_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(logit_Y1)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y0_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(logit_Y0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(Y1_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; Y0_true)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W1, W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W2, W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W3, W4 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W4, A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; A, Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; Y)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;tune &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ntrees &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;500&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;),           &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# More trees for better performance&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  max_depth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;),                    &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Deeper trees capture interactions&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  shrinkage &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0.001&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;0.01&lt;/span&gt;)    &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Finer learning rate control&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;learners &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;create.Learner&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.xgboost&amp;#34;&lt;/span&gt;, tune &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; tune, detailed_names &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;, name_prefix &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgb&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Super Learner library &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;SL_library &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.xgboost&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.ranger&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.glm&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.mean&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                   &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.xgboost&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.ranger&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                   &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.xgboost&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.glm&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                   &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.ranger&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.xgboost&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;screen.glmnet&amp;#34;&lt;/span&gt;)),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                   &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(learners&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;names, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL.glm&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# sample&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n_sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Function to run one TMLE iteration&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;run_tmle_iteration &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(seed_val, df, n_i, SL_library) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(seed_val)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(df, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_i, replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(Y, A, W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W4)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Prepare data&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  X_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(A, W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W4) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.data.frame&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  X_treatment &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W4) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.data.frame&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  Y_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;Y
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  A_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;A
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Outcome model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  SL_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;SuperLearner&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; Y_vec,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; X_outcome,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;binomial&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    SL.library &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; SL_library,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    cvControl &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(V &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Initial predictions&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(SL_outcome, newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; X_outcome)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;pred
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Predict under treatment A=1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  X_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; X_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(SL_outcome, newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; X_outcome_1)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;pred
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Predict under treatment A=0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  X_outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; X_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(SL_outcome, newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; X_outcome_0)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;pred
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Bound outcome predictions to avoid qlogis issues&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmax&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmin&lt;/span&gt;(outcome, &lt;span style=&#34;color:#099&#34;&gt;0.9999&lt;/span&gt;), &lt;span style=&#34;color:#099&#34;&gt;0.0001&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmax&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmin&lt;/span&gt;(outcome_1, &lt;span style=&#34;color:#099&#34;&gt;0.9999&lt;/span&gt;), &lt;span style=&#34;color:#099&#34;&gt;0.0001&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmax&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmin&lt;/span&gt;(outcome_0, &lt;span style=&#34;color:#099&#34;&gt;0.9999&lt;/span&gt;), &lt;span style=&#34;color:#099&#34;&gt;0.0001&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Treatment model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  SL_treatment &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;SuperLearner&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; A_vec,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; X_treatment,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;binomial&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    SL.library &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; SL_library,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    cvControl &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(V &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Propensity scores&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ps &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(SL_treatment, newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; X_treatment)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;pred
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Truncate propensity scores &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ps_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmax&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmin&lt;/span&gt;(ps, &lt;span style=&#34;color:#099&#34;&gt;0.95&lt;/span&gt;), &lt;span style=&#34;color:#099&#34;&gt;0.05&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Calculate clever covariates&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  a_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  a_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; ps_final)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  clever_covariate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(A_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, a_1, a_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Fluctuation step: logistic regression of Y on the clever covariate, with the initial fit as offset&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  epsilon_model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;offset&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; clever_covariate, 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                       family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  epsilon &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;coef&lt;/span&gt;(epsilon_model)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Update the counterfactual predictions with the estimated epsilon&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  updated_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome_1) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; epsilon &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt; a_1)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  updated_outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome_0) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; epsilon &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt; a_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Calc ATE&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(updated_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; updated_outcome_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Calc SE from the sample variance of the estimated influence curve&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  updated_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(A_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, updated_outcome_1, updated_outcome_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sqrt&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;var&lt;/span&gt;((Y_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; updated_outcome) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt; clever_covariate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                   updated_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; updated_outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; ate) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt; n_i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; ate, se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; se))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# For each Super Learner library, run the iterations in parallel&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(num in &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;seq_along&lt;/span&gt;(SL_library)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;cat&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;TMLE iterations in parallel with 4 workers (multicore)...\n&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;start_time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;Sys.time&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;results_list &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;future_lapply&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;n_sample, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(i) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  result &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;run_tmle_iteration&lt;/span&gt;(i, df, n_i, SL_library[[num]])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%%&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;100&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;cat&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;Completed iteration:&amp;#34;&lt;/span&gt;, i, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;\n&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(result)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}, future.seed &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;end_time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;Sys.time&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;run_time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; end_time &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; start_time
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Extract results&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sapply&lt;/span&gt;(results_list, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(x) x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;ate)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;pred_se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sapply&lt;/span&gt;(results_list, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(x) x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;se)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Results&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;results &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  iteration &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;n_sample,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; predicted_ate,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; pred_se,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ci_lower &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1.96&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt; se,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ci_upper &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1.96&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt; se,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  covers_truth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; ci_lower &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; ci_upper
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Summary stats&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;summary_stats &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;true_ATE&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;mean_estimated_ATE&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;median_estimated_ATE&amp;#34;&lt;/span&gt;, 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;             &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;sd_estimates&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;mean_SE&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;coverage_probability&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;bias&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  value &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    true_ATE,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(predicted_ate),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;median&lt;/span&gt;(predicted_ate),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sd&lt;/span&gt;(predicted_ate),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(pred_se),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(results&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;covers_truth),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(predicted_ate) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; true_ATE
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  )
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Create output directory if it doesn&amp;#39;t exist&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dir.exists&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_results&amp;#34;&lt;/span&gt;)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dir.create&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_results&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Save detailed results (all iterations)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;write.csv&lt;/span&gt;(results, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_results/tmle_iterations&amp;#34;&lt;/span&gt;,num,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;.csv&amp;#34;&lt;/span&gt;), row.names &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Save summary statistics&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;write.csv&lt;/span&gt;(summary_stats, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_results/tmle_summary&amp;#34;&lt;/span&gt;,num,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;.csv&amp;#34;&lt;/span&gt;), row.names &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Save simulation parameters&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sim_params &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  parameter &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;n_population&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;n_sample_iterations&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;n_bootstrap_size&amp;#34;&lt;/span&gt;, 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;SL_library&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;n_workers&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;runtime_seconds&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  value &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(n, n_sample, n_i, 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(SL_library[[num]], collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;, &amp;#34;&lt;/span&gt;), 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;(run_time, units &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;secs&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;write.csv&lt;/span&gt;(sim_params, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_results/simulation_parameters&amp;#34;&lt;/span&gt;,num,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;.csv&amp;#34;&lt;/span&gt;), row.names &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Save as RData for easy loading&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;save&lt;/span&gt;(results, summary_stats, sim_params, true_ATE, file &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_results/tmle_results&amp;#34;&lt;/span&gt;,num,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;.RData&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p&gt;As before, we&amp;rsquo;ve set up our true ATE and simulation: we sample 60% of the N=10000 population and run a bootstrap of 1000. This time, instead of a solo ML model, we will use &lt;code&gt;Super Learner&lt;/code&gt; to ensemble several models, then estimate our ATE and check whether the 95% CI covers the true ATE. Below are the combinations we will investigate:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;xgboost + randomforest + glm + mean&lt;/li&gt;
&lt;li&gt;xgboost + randomforest&lt;/li&gt;
&lt;li&gt;xgboost + glm&lt;/li&gt;
&lt;li&gt;randomforest + (xgboost + screen.glmnet) + glm&lt;/li&gt;
&lt;li&gt;xgboost_tuned + glm&lt;/li&gt;
&lt;/ol&gt;




&lt;h2 id=&#34;results&#34;&gt;What&amp;rsquo;s The Result
  &lt;a href=&#34;#results&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/superlearner-tmle/index_files/figure-html/unnamed-chunk-3-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Wow, these are very interesting results! Looking at 1), where we have xgboost, randomforest, glm, and mean, our coverage was pretty horrible at ~50%. For 3), with xgboost and glm, coverage increased to ~80%, very similar to our solo tuned xgboost previously but at least with symmetrical non-coverage. But when we 2) combined only xgboost and randomforest, the coverage dropped back to ~50%. The same thing happened when we 4) screened features with glmnet prior to xgboost and combined that with the other learners: coverage was only ~50%. But when we 5) tuned xgboost and combined it with glm, coverage increased to ~90%! I was not expecting this result, and it is quite amazing! As for how long each combination took on a quad-core processor using multicore:&lt;/p&gt;
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; method &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; time_hours &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.xgboost, SL.ranger, SL.glm, SL.mean &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 4.02 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.xgboost, SL.ranger &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 4.00 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.xgboost, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.47 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; SL.ranger, SL.xgboost, screen.glmnet &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 4.23 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; tuned SL.xgboost, SL.glm &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 3.29 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;




&lt;h2 id=&#34;opportunity&#34;&gt;Opportunities for improvement
  &lt;a href=&#34;#opportunity&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;attempt using optimized hyperparameters for both xgboost and randomforest&lt;/li&gt;
&lt;li&gt;write a function for splitting multicore to several nodes&lt;/li&gt;
&lt;li&gt;try using 
&lt;a href=&#34;https://cran.r-project.org/web/packages/SuperLearner/vignettes/Guide-to-SuperLearner.html#xgboost-hyperparameter-exploration&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;NNloglik / maximize on auc&lt;/a&gt; instead of nnls, would it work better?&lt;/li&gt;
&lt;li&gt;need to test whether &lt;code&gt;set.seed&lt;/code&gt; works differently with multicore. Does future automatically choose the &lt;code&gt;L&#39;Ecuyer-CMRG&lt;/code&gt; algorithm? Does our current method of calling &lt;code&gt;set.seed(i)&lt;/code&gt; with the iteration number work OK?&lt;/li&gt;
&lt;li&gt;need to try &lt;code&gt;mcSuperLearner&lt;/code&gt;, built-in multicore&lt;/li&gt;
&lt;/ul&gt;
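&lt;p&gt;On the seeding question, here is a minimal sketch of one way to check reproducibility. It assumes the &lt;code&gt;future.apply&lt;/code&gt; package and its &lt;code&gt;future.seed = TRUE&lt;/code&gt; argument, which, as far as I understand the documentation, hands each element its own L&#39;Ecuyer-CMRG stream derived from the current RNG state. The toy function and the worker count are made up for illustration and are not the simulation code above.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(future.apply)   # also attaches future, which provides plan()

# Assumption: 2 background R sessions; swap in multicore where it is supported
plan(multisession, workers = 2)

# First run: the seed fixes the RNG state that future.seed = TRUE starts from
set.seed(123)
run1 &amp;lt;- future_sapply(1:4, function(i) rnorm(1), future.seed = TRUE)

# Second run with the same seed
set.seed(123)
run2 &amp;lt;- future_sapply(1:4, function(i) rnorm(1), future.seed = TRUE)

# Both runs should give the same draws, regardless of which worker ran what
identical(run1, run2)

plan(sequential)        # return to the default, non-parallel plan
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If this holds on your machine, one option is to call &lt;code&gt;set.seed&lt;/code&gt; once before &lt;code&gt;future_lapply&lt;/code&gt; and let &lt;code&gt;future.seed = TRUE&lt;/code&gt; manage the per-iteration streams, rather than calling &lt;code&gt;set.seed(i)&lt;/code&gt; inside each iteration.&lt;/p&gt;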




&lt;h2 id=&#34;lessons&#34;&gt;Lessons Learnt
  &lt;a href=&#34;#lessons&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;learnt &lt;code&gt;future_lapply&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;learnt &lt;code&gt;multicore&lt;/code&gt; is so much faster!!! Though when we &lt;code&gt;future_lapply&lt;/code&gt; the function it appears to have the same time as &lt;code&gt;multisession&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;tested &lt;code&gt;Super Learner&lt;/code&gt;, it&amp;rsquo;s pretty cool! It already has options for parallel computing&lt;/li&gt;
&lt;li&gt;found out that mixing a flexible model like xgboost (tuned) with logreg appears to regularize the ensembled model&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you like this article:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;please feel free to send me a 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;comment or visit my other blogs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;please feel free to follow me on 
&lt;a href=&#34;https://bsky.app/profile/kenkoonwong.bsky.social&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;BlueSky&lt;/a&gt;, 
&lt;a href=&#34;https://twitter.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;twitter&lt;/a&gt;, 
&lt;a href=&#34;https://github.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub&lt;/a&gt; or 
&lt;a href=&#34;https://med-mastodon.com/@kenkoonwong&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Mastodon&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;if you would like to collaborate please feel free to 
&lt;a href=&#34;https://www.kenkoonwong.com/contact/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;contact me&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>Happy New Year 2026</title>
      <link>https://www.kenkoonwong.com/blog/2026/</link>
      <pubDate>Thu, 01 Jan 2026 00:00:00 +0000</pubDate>
      
      <guid>https://www.kenkoonwong.com/blog/2026/</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;My 2026 New Year Resolutions. Reflect on 2025. Writing it down for 2026 to make it more accountable&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src=&#34;2026.jpg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;




&lt;h3 id=&#34;my-2025-new-year-resolutions&#34;&gt;My 2025 New Year Resolutions
  &lt;a href=&#34;#my-2025-new-year-resolutions&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Explore Gimp&lt;/li&gt;
&lt;li&gt;&lt;del&gt;Explore useful VIM key binding&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;&lt;del&gt;Learn nodejs, javascript, fasthtml, fastapi, plumber to create a simple tool for calling API and storing data&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;&lt;del&gt;Explore and implement LLM-assisted learning scenarios for eval score calibration (DE Level 3)&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;&lt;del&gt;re-create automation at wrk&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;&lt;del&gt;?sti dashboard&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;&lt;del&gt;Complete my neural network study. This time we&amp;rsquo;re getting close!&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;Practice 华语&lt;/li&gt;
&lt;li&gt;Dive into McElreath&amp;rsquo;s Statistical Rethinking a 2nd time&lt;/li&gt;
&lt;li&gt;&lt;del&gt;Maintain coaching skill and practice more! ?reach out to CECM for 2&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;Finish 2 manuscripts&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src=&#34;reflect.jpg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;I did explore gimp a little bit but not enough to strike that off the list. We&amp;rsquo;ll see if I&amp;rsquo;ll resume that in 2026. One thing I found is that oftentimes it&amp;rsquo;s not the initial attempt that makes me commit to learning something; usually it&amp;rsquo;s the second, third, or even fourth time, with long periods in between, that brings me closer to understanding certain things. Or at least the illusion of understanding, that is. 🤣&lt;/p&gt;
&lt;p&gt;I did start using vim key binding. Thanks Alec for the motivation! I need to get better this year for sure! With regards to plumber API, we did explore that! Wrote 2 blogs on it too! 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/plumber/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;here&lt;/a&gt; and 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/qbank/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Quite satisfied with what got completed in 2025: LLM-assisted scenarios, the neural network study, re-created automation, the sti dashboard, maintaining coaching skill, and 1 manuscript prepared instead of 2. Let&amp;rsquo;s work on that in 2026!&lt;/p&gt;
&lt;p&gt;Something that I wasn&amp;rsquo;t expecting in 2025 was a year of diving head-first into bioinformatics. Wow, I thought it would be just 1 or 2 blogs but no: the various topics, the interesting math and methods behind them, I just couldn&amp;rsquo;t resist learning more and more, even if only scratching the surface of these topics! Very intriguing indeed! I did 8 blogs total on bioinformatics, and there are more topics that I want to dive into, such as ampC, CRE, mec, vanA etc. Looking forward!&lt;/p&gt;
&lt;p&gt;Things that didn&amp;rsquo;t go as planned -&amp;gt; I started out ok with 华语 on duolingo but then it tapered off. Didn&amp;rsquo;t really dive into Richard McElreath&amp;rsquo;s Statistical Rethinking the second time. Did only 1 manuscript. Didn&amp;rsquo;t put too much effort into gimp. Maybe better effort this year!&lt;/p&gt;




&lt;h3 id=&#34;my-2026-new-year-resolutions&#34;&gt;My 2026 New Year Resolutions
  &lt;a href=&#34;#my-2026-new-year-resolutions&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Explore molecular docking &amp;amp; medication discovery&lt;/li&gt;
&lt;li&gt;Explore more bioinformatics (e.g ampC, CRE, mec, vanA etc), 59 skill&lt;/li&gt;
&lt;li&gt;Use more parallel computing (at least once a month with &lt;code&gt;future&lt;/code&gt; and personally built function for multicore)&lt;/li&gt;
&lt;li&gt;Build a computer cluster&lt;/li&gt;
&lt;li&gt;Explore simple quantum computing&lt;/li&gt;
&lt;li&gt;Use more tidymodels, keras and torch; I do think I&amp;rsquo;ll be exploring more AI/ML models in 2026 due to research&lt;/li&gt;
&lt;li&gt;Git, can&amp;rsquo;t get away from this&lt;/li&gt;
&lt;li&gt;Healthy lifestyle&lt;/li&gt;
&lt;li&gt;Attend res conference at least 4x/month&lt;/li&gt;
&lt;li&gt;work research&lt;/li&gt;
&lt;li&gt;use more vim (at least cumulative 15 mins per week - try working on copy &amp;amp; pasting)&lt;/li&gt;
&lt;li&gt;Rewatch Richard McElreath&amp;rsquo;s Statistical Rethinking, especially on Gaussian Process and ordered logistic regression&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>My Messy Notes on Building a Super Learner: Peeking Under The Hood of NNLS</title>
      <link>https://www.kenkoonwong.com/blog/superlearner/</link>
      <pubDate>Sun, 21 Dec 2025 00:00:00 +0000</pubDate>
      
      <guid>https://www.kenkoonwong.com/blog/superlearner/</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;📚 Tried building Super Learner from scratch to understand what&amp;rsquo;s happening under the hood. Walked through the NNLS algorithm step-by-step—turns out ensembling models may beat solo models! Our homegrown version? Surprisingly close to nnls package results ❤️ But, does it really work in real life? 🤷‍♂️&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src=&#34;ensemble.jpg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;




&lt;h2 id=&#34;motivations&#34;&gt;Motivations
  &lt;a href=&#34;#motivations&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Previously we have learnt the workflow of 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/tmle/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;TMLE&lt;/a&gt; and most people would say to use it with Super Learner. But what is Super Learner? The name sounds fancy and cool. Let&amp;rsquo;s take a look under the hood and see how super this Super Learner really is. In this blog, we will see what non-negative least squares is and which algorithm behind it fuels Super Learner. We&amp;rsquo;ll take a look at the mathematical procedure, then code it from scratch and see if we can reproduce the result. Let&amp;rsquo;s do this!&lt;/p&gt;




&lt;h2 id=&#34;objectives&#34;&gt;Objectives:
  &lt;a href=&#34;#objectives&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#what&#34;&gt;What is Super Learner?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#engine&#34;&gt;What is the algorithm behind Super Learner?&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#nnls&#34;&gt;Non-negative Least Squares&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#lha&#34;&gt;Lawson-Hanson algorithm&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#code&#34;&gt;Let&amp;rsquo;s Put Them All Together&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#super&#34;&gt;Let&amp;rsquo;s Super Learn this thing&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#opportunity&#34;&gt;Opportunities for improvement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#lessons&#34;&gt;Lessons learnt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2 id=&#34;what&#34;&gt;What is Super Learner?
  &lt;a href=&#34;#what&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Super Learner is an ensemble machine learning algorithm that optimally combines predictions from multiple candidate algorithms to create a single ensembled model. Rather than selecting a single &amp;ldquo;best&amp;rdquo; model through traditional model selection methods, Super Learner leverages the strengths of various algorithms by creating a weighted average of their predictions. The fundamental insight is elegant: why choose between a random forest, generalized linear model, or gradient boosting machine when you can let the data determine the optimal combination of all three? This approach was introduced by Mark van der Laan and colleagues and has become particularly popular in causal inference and epidemiology, often paired with Targeted Maximum Likelihood Estimation (TMLE) to obtain robust, efficient estimates of causal effects.&lt;/p&gt;




&lt;h2 id=&#34;engine&#34;&gt;What is the algorithm behind Super Learner?
  &lt;a href=&#34;#engine&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;The beauty of Super Learner lies in its theoretical guarantee: asymptotically, it performs at least as well as the best single algorithm in your library of candidate learners, and in practice it often performs better. This property, known as the 
&lt;a href=&#34;https://vanderlaan-lab.org/2019/05/11/adaptive-algorithm-selection-via-the-super-learner/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;oracle inequality&lt;/a&gt;, means that Super Learner asymptotically achieves the lowest possible prediction error among the combinations of the candidate algorithms. 🤔 To be transparent, I don&amp;rsquo;t really understand all of this. But, let&amp;rsquo;s move on. The engine behind this is 
&lt;a href=&#34;https://en.wikipedia.org/wiki/Non-negative_least_squares&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Non-Negative Least Squares (NNLS)&lt;/a&gt;, an elegant constrained optimization method that finds the optimal weights for combining your candidate algorithms.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: Super Learner theory does not require NNLS, but NNLS works well in practice, is often much faster than true convex combination optimization, and goes back to the early work on stacked regressions by Leo Breiman.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3 id=&#34;nnls&#34;&gt;NNLS
  &lt;a href=&#34;#nnls&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;At its core, NNLS solves a seemingly simple problem: given a matrix &lt;code&gt;X&lt;/code&gt; of predictions from your candidate algorithms and an outcome vector &lt;code&gt;y&lt;/code&gt;, find weights &lt;code&gt;β&lt;/code&gt; that minimize the squared prediction error &lt;code&gt;||y - Xβ||²&lt;/code&gt; subject to two crucial constraints:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;all weights must be non-negative (β ≥ 0)&lt;/li&gt;
&lt;li&gt;&lt;del&gt;the weights must sum to one (ensuring a proper convex combination).&lt;/del&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;Note: Unlike the theoretical Super Learner which requires weights to sum to one (convex combination), NNLS only enforces non-negativity. This relaxation makes the optimization much faster while performing just as well in practice. Thanks to 
&lt;a href=&#34;https://health.uchicago.edu/faculty/eric-polley-phd&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eric Polley&lt;/a&gt; for correcting and educating me on the above. Much appreciated!&lt;/p&gt;
&lt;/blockquote&gt;
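&lt;p&gt;To make the definition concrete, here is a minimal base R sketch on made-up data (the matrix, the outcome, and the helper &lt;code&gt;nnls_brute&lt;/code&gt; are all hypothetical, not from any package). It is not the Lawson-Hanson algorithm: it simply fits ordinary least squares on every subset of learners, throws away any fit with a negative weight, and keeps the one with the smallest squared error. That only scales to a handful of learners, but it shows what &amp;ldquo;least squares with non-negative weights&amp;rdquo; means.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical toy data: 4 observations, each column holds the
# predictions of one made-up candidate learner
X &amp;lt;- cbind(a = c(1, 2, 3, 4), b = c(2, 1, 4, 3), c = rep(2.5, 4))
y &amp;lt;- c(1.1, 1.9, 3.2, 3.9)

# Brute-force NNLS: least squares on every non-empty subset of columns,
# keeping the feasible (all weights non-negative) fit with the lowest error
nnls_brute &amp;lt;- function(X, y) {
  p &amp;lt;- ncol(X)
  best &amp;lt;- list(beta = rep(0, p), rss = sum(y^2))   # all-zero weights as baseline
  for (k in seq_len(2^p - 1)) {
    keep &amp;lt;- (k %/% 2^(seq_len(p) - 1)) %% 2 == 1   # columns in the k-th subset
    fit &amp;lt;- lm.fit(X[, keep, drop = FALSE], y)
    b &amp;lt;- fit$coefficients
    if (anyNA(b) || any(b &amp;lt; 0)) next               # rank-deficient or infeasible
    rss &amp;lt;- sum(fit$residuals^2)
    if (rss &amp;lt; best$rss) {
      beta &amp;lt;- rep(0, p)
      beta[keep] &amp;lt;- b
      best &amp;lt;- list(beta = beta, rss = rss)
    }
  }
  best
}

fit &amp;lt;- nnls_brute(X, y)
fit$beta                    # non-negative weights, one per learner
w &amp;lt;- fit$beta / sum(fit$beta)
w                           # same weights rescaled to sum to one
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The last step reflects my understanding that, in practice, the non-negative coefficients are rescaled to sum to one after fitting, which recovers a convex combination without having to enforce it inside the optimization.&lt;/p&gt;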




&lt;h3 id=&#34;lha&#34;&gt;Lawson-Hanson Algorithm
  &lt;a href=&#34;#lha&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;The most commonly used algorithm for solving NNLS is the active set method developed by Lawson and Hanson in 1974. This iterative algorithm is remarkably intuitive: it maintains two sets of variables, an &amp;ldquo;active set&amp;rdquo; of variables currently in the model with positive weights, and a &amp;ldquo;passive set&amp;rdquo; of variables currently excluded (with zero weights). (Heads up: the optimization literature usually names these the other way round, calling the positive-weight variables the passive set because their non-negativity constraints are not binding. This post sticks with the naming above.) The algorithm begins with all variables in the passive set, then iteratively identifies which passive variable, if added to the active set, would most improve the fit. Once a variable enters the active set, the algorithm solves an unconstrained least squares problem using only the active variables. If any weights become negative during this step, the algorithm removes the most negative variable from the active set and repeats the process. This addition-and-removal dance continues until no passive variable would improve the fit and all active variables have positive weights, at which point we&amp;rsquo;ve found our optimal solution.&lt;/p&gt;
&lt;p&gt;OK, too many words above. Not a fan. 😵‍💫 Let&amp;rsquo;s break the procedure down into steps and walk through it with code, starting with a simple example.&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbind&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1.5&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;4.5&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;6&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;6&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p&gt;$$
\begin{gather}
\text{X} =
\begin{bmatrix}
1.5 &amp;amp; 3 &amp;amp; 4 \\
0.5 &amp;amp; 2 &amp;amp; 3 \\
4.5 &amp;amp; 6 &amp;amp; 6
\end{bmatrix}
;
\text{y} =
\begin{bmatrix}
2 \\ 1 \\ 5
\end{bmatrix}
\end{gather}
$$&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s take a quick look at the matrices above. Just glancing at them, you might guess that most of the weight should go to the models in column &lt;code&gt;1&lt;/code&gt; and column &lt;code&gt;2&lt;/code&gt;. Let&amp;rsquo;s go through the Lawson-Hanson procedure step by step.&lt;/p&gt;




&lt;h4 id=&#34;step-0-initialize-your-sets&#34;&gt;Step 0: Initialize Your Sets
  &lt;a href=&#34;#step-0-initialize-your-sets&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;p&gt;Start with all variables in the passive set (R) and none in the active set (P). Like so:&lt;/p&gt;
&lt;p&gt;$$
\text{P} = \emptyset
$$&lt;/p&gt;
&lt;p&gt;$$
R = \{1, 2, 3\}
$$
$$
\beta =
\begin{bmatrix}
0 \\ 0 \\ 0
\end{bmatrix}
$$&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;P&lt;/code&gt; : Active Set (Take note, I used P as active, not passive; also these are indexes)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;R&lt;/code&gt; : Passive Set (take note, these are indexes)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;β&lt;/code&gt; : weights&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We will go through the iterative procedure below, moving indexes from the passive set (R) to the active set (P) one at a time, until no remaining passive variable has a positive gradient or R is empty.&lt;/p&gt;
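&lt;p&gt;As a minimal sketch of Step 0 in R (assuming the &lt;code&gt;X&lt;/code&gt; defined above; the variable names &lt;code&gt;P&lt;/code&gt;, &lt;code&gt;R&lt;/code&gt; and &lt;code&gt;beta&lt;/code&gt; simply mirror the notation), the starting state could look like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# same toy matrix as above
X &amp;lt;- rbind(c(1.5, 3, 4), c(0.5, 2, 3), c(4.5, 6, 6))

P &amp;lt;- integer(0)          # active set: nothing selected yet
R &amp;lt;- seq_len(ncol(X))    # passive set: every column index
beta &amp;lt;- rep(0, ncol(X))  # all weights start at zero
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;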




&lt;h4 id=&#34;step-1-find-gradient&#34;&gt;Step 1 Find Gradient
  &lt;a href=&#34;#step-1-find-gradient&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# step 1: find gradient &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gradient &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;t&lt;/span&gt;(X) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; (y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; beta)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sum&lt;/span&gt;(gradient&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dim&lt;/span&gt;(X)[2]) { &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;stop&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;all gradients are zero or negative, we have achieved optimality&amp;#34;&lt;/span&gt;) }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(R)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) { &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;stop&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;R is empty&amp;#34;&lt;/span&gt;)}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p&gt;$$
\begin{gather}
\text{Gradient} = \text{X}^{\text{T}} \cdot (\text{y} - \text{X}\beta)
\end{gather}
$$
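This is the negative gradient of &lt;code&gt;½||y - Xβ||²&lt;/code&gt; with respect to &lt;code&gt;β&lt;/code&gt;, so a positive entry means that increasing that weight reduces the error. Let&amp;rsquo;s put in the numbers and calculate.&lt;/p&gt;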
&lt;p&gt;$$
\begin{gather}
\text{Gradient} = \text{X}^{\text{T}} \cdot (\text{y} - \text{X}\beta) \\
= \begin{bmatrix}
1.5 &amp;amp; 3 &amp;amp; 4 \\
0.5 &amp;amp; 2 &amp;amp; 3 \\
4.5 &amp;amp; 6 &amp;amp; 6
\end{bmatrix}^\text{T} \cdot (
\begin{bmatrix}
2 \\ 1 \\ 5
\end{bmatrix} -
\begin{bmatrix}
1.5 &amp;amp; 3 &amp;amp; 4 \\
0.5 &amp;amp; 2 &amp;amp; 3 \\
4.5 &amp;amp; 6 &amp;amp; 6
\end{bmatrix}
\begin{bmatrix}
0 \\ 0 \\ 0
\end{bmatrix}
) \\
= \begin{bmatrix}
1.5 &amp;amp; 0.5 &amp;amp; 4.5 \\
3 &amp;amp; 2 &amp;amp; 6 \\
4 &amp;amp; 3 &amp;amp; 6
\end{bmatrix} \cdot (
\begin{bmatrix}
2 \\ 1 \\ 5
\end{bmatrix} -
\begin{bmatrix}
1.5 &amp;amp; 3 &amp;amp; 4 \\
0.5 &amp;amp; 2 &amp;amp; 3 \\
4.5 &amp;amp; 6 &amp;amp; 6
\end{bmatrix}
\begin{bmatrix}
0 \\ 0 \\ 0
\end{bmatrix}
) \\
= \begin{bmatrix}
1.5 &amp;amp; 0.5 &amp;amp; 4.5 \\
3 &amp;amp; 2 &amp;amp; 6 \\
4 &amp;amp; 3 &amp;amp; 6
\end{bmatrix} \cdot
\begin{bmatrix}
2 \\ 1 \\ 5
\end{bmatrix} \\
= \begin{bmatrix}
26 \\ 38 \\ 41
\end{bmatrix}
\end{gather}
$$&lt;/p&gt;




&lt;h4 id=&#34;step-2-check-optimality--find-next-variable-to-add-to-p&#34;&gt;Step 2 Check Optimality &amp;amp; Find Next Variable to Add To P
  &lt;a href=&#34;#step-2-check-optimality--find-next-variable-to-add-to-p&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;p&gt;If all gradients are non-positive, we have achieved optimality. If not, proceed to the next step. Find the index in R with the maximum gradient. In this case, the max of &lt;code&gt;26, 38, 41&lt;/code&gt; is &lt;code&gt;41&lt;/code&gt;, which belongs to the third column of &lt;code&gt;X&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;$$
\begin{gather}
\text{Next Variable} = \text{argmax}_{j \in R} \text{Gradient}_j \\
= 3
\end{gather}
$$
We then move &lt;code&gt;3&lt;/code&gt; from &lt;code&gt;R&lt;/code&gt; (passive set) to &lt;code&gt;P&lt;/code&gt; (active set) like so:&lt;/p&gt;
&lt;p&gt;$$
\text{P} = \{3\}
$$
$$
R = \{1, 2\}
$$&lt;/p&gt;




&lt;h4 id=&#34;step-3-solve-the-unconstrained-least-squares-problem-for-active-set-p&#34;&gt;Step 3 Solve the Unconstrained Least Squares Problem for Active Set P.
  &lt;a href=&#34;#step-3-solve-the-unconstrained-least-squares-problem-for-active-set-p&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;p&gt;$$
\begin{gather}
\beta_P = (\text{X}_P^{\text{T}} \cdot \text{X}_P)^{-1} \cdot \text{X}_P^{\text{T}} \cdot \text{y} \\
= (\begin{bmatrix}
4 \\
3 \\
6
\end{bmatrix}^{\text{T}} \cdot
\begin{bmatrix}
4 \\
3 \\
6
\end{bmatrix})^{-1} \cdot
\begin{bmatrix}
4 \\
3 \\
6
\end{bmatrix}^{\text{T}} \cdot
\begin{bmatrix}
2 \\
1 \\
5
\end{bmatrix} \\
= (16 + 9 + 36)^{-1} \cdot
\begin{bmatrix}
4 &amp;amp; 3 &amp;amp; 6
\end{bmatrix} \cdot
\begin{bmatrix}
2 \\
1 \\
5
\end{bmatrix} \\
= 61^{-1} \cdot
\begin{bmatrix}
4 \cdot 2 + 3 \cdot 1 + 6 \cdot 5
\end{bmatrix} \\
= 61^{-1} \cdot
\begin{bmatrix}
41
\end{bmatrix} \\
= \begin{bmatrix}
0.6721311
\end{bmatrix}
\end{gather}
$$&lt;/p&gt;
&lt;p&gt;Where &lt;code&gt;X_P&lt;/code&gt; is the sub-matrix of &lt;code&gt;X&lt;/code&gt; containing only the columns in the active set &lt;code&gt;P&lt;/code&gt;. In our case, &lt;code&gt;P = {3}&lt;/code&gt;, so &lt;code&gt;β&lt;/code&gt; is:&lt;/p&gt;
&lt;p&gt;$$
\begin{gather}
\beta =
\begin{bmatrix}
0 \\ 0 \\ 0.6721311
\end{bmatrix}
\end{gather}
$$
Still with me? We went from zero weights (beta) for all 3 models to the third model having a weight of &lt;code&gt;0.67213&lt;/code&gt;.&lt;/p&gt;
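&lt;p&gt;If you would rather let R do the arithmetic for Steps 1 to 3, here is a small self-contained sketch. It assumes the same &lt;code&gt;X&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; as above and starts from all-zero weights; &lt;code&gt;j&lt;/code&gt; and &lt;code&gt;X_P&lt;/code&gt; are just illustrative names.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;X &amp;lt;- rbind(c(1.5, 3, 4), c(0.5, 2, 3), c(4.5, 6, 6))
y &amp;lt;- c(2, 1, 5)
beta &amp;lt;- rep(0, ncol(X))

# step 1: gradient at the current weights
gradient &amp;lt;- drop(t(X) %*% (y - X %*% beta))

# step 2: the index with the largest gradient enters the active set
j &amp;lt;- which.max(gradient)

# step 3: least squares using only the selected column
X_P &amp;lt;- X[, j, drop = FALSE]
beta[j] &amp;lt;- drop(solve(t(X_P) %*% X_P) %*% t(X_P) %*% y)

print(gradient)
print(j)
print(beta)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;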




&lt;h4 id=&#34;step-4-check-for-negative-weights&#34;&gt;Step 4 Check For Negative Weights
  &lt;a href=&#34;#step-4-check-for-negative-weights&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;p&gt;$$
\begin{gather}
\text{If any } \beta_P \leq 0, \text{ calculate } \alpha \text{; otherwise set } \beta = \beta_P \text{ and go back to step 1}\\
\alpha = \min_{\beta_P \leq 0} \frac{\beta_{old}}{\beta_{old} - \beta_P} \\
\text{If } \alpha &amp;lt; 1, \text{ update } \beta = \beta_{old} + \alpha (\beta_P - \beta_{old}) \\
\text{Remove any variables from P where } \beta \leq 0 \text{ and return them to R}
\end{gather}
$$
Our weights &lt;code&gt;\(\beta\)&lt;/code&gt; cannot be negative. If the new solution contains a negative value, we do not jump straight to it. Instead, we move every weight from its old value toward its new value by the same fraction &lt;code&gt;\(\alpha\)&lt;/code&gt;, the largest fraction that keeps all weights non-negative. The first weight to reach zero lands exactly on &lt;code&gt;0&lt;/code&gt; and returns to &lt;code&gt;R&lt;/code&gt;, and the remaining weights are adjusted proportionally.&lt;/p&gt;
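&lt;p&gt;A tiny made-up example may help here (the numbers are hypothetical and not taken from our &lt;code&gt;X&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt;): suppose the previous weights for two active variables were 0.5 and 0.2, and the new least squares solution is 0.8 and -0.1.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;beta_old &amp;lt;- c(0.5, 0.2)   # previous, feasible weights (hypothetical)
beta_P   &amp;lt;- c(0.8, -0.1)  # unconstrained solution; second weight is negative

neg &amp;lt;- beta_P &amp;lt;= 0
# largest step toward beta_P that keeps every weight non-negative
alpha &amp;lt;- min(beta_old[neg] / (beta_old[neg] - beta_P[neg]))
beta &amp;lt;- beta_old + alpha * (beta_P - beta_old)

print(alpha)
print(beta)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here &lt;code&gt;alpha&lt;/code&gt; works out to 2/3, so the second weight lands on zero and goes back to &lt;code&gt;R&lt;/code&gt;, while the first weight moves only two thirds of the way from 0.5 to 0.8, ending at 0.7.&lt;/p&gt;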
&lt;p&gt;After that, we keep iterating until no gradient is positive or the &lt;code&gt;R&lt;/code&gt; set is empty. You get the point. Instead of writing out the entire calculation in LaTeX, let&amp;rsquo;s use code to get to our answers.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;P &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;R &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dim&lt;/span&gt;(X)[2])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;beta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rep&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dim&lt;/span&gt;(X)[2])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;while &lt;/span&gt;(&lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# step 1: find gradient &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gradient &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;t&lt;/span&gt;(X) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; (y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; beta)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sum&lt;/span&gt;(gradient&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dim&lt;/span&gt;(X)[2]) { &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;all gradients are zero or negative, we have achieved optimality&amp;#34;&lt;/span&gt;) ; break }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(R)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) { &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;R is empty&amp;#34;&lt;/span&gt;) ; break }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# step 2: check optimality&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gradient_not_active &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; gradient
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gradient_not_active[P] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;Inf&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;P_x &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;which&lt;/span&gt;(gradient_not_active&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;max&lt;/span&gt;(gradient_not_active))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;P &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(P,P_x) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unique&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sort&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;R &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;setdiff&lt;/span&gt;(R, P_x)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# solve P&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;beta_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;beta_i[P] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;solve&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;t&lt;/span&gt;(X[,P]) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; X[,P]) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;t&lt;/span&gt;(X[,P])&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt;y
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;any&lt;/span&gt;(beta_i&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)) { 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;negative weights: &amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(beta_i, collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;)))  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  idx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;which&lt;/span&gt;(beta_i&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  beta_old &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta[idx]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  beta_new &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta_i[idx]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  alpha &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta_old&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/-&lt;/span&gt;(beta_new&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;beta_old)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  beta_i_new &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; alpha&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;(beta&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;beta_i) 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  beta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta_i_new &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(digits &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;new weights after setting negative weight as zero: &amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(beta,collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;)))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  } else {  beta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta_i ; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(beta) }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 0.0000000 0.0000000 0.6721311
## [1] 0.8683544 0.0000000 0.1810127
## [1] &amp;#34;negative weights: 0.666666666666675 0.333333333333364 -7.105427357601e-15&amp;#34;
## [1] &amp;#34;new weights after setting negative weight as zero: 0.6667 0.3333 0&amp;#34;
## [1] &amp;#34;R is empty&amp;#34;
&lt;/code&gt;&lt;/pre&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;beta
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 0.6667 0.3333 0.0000
&lt;/code&gt;&lt;/pre&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; beta
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;##         [,1]
## [1,] 1.99995
## [2,] 0.99995
## [3,] 4.99995
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Wow, it worked! Look at our weights (beta) and our final results! As suspected, columns 1 and 2 carry all the weight (more on column 1), and when combined our final numbers are very close to our &lt;code&gt;y&lt;/code&gt;, which is 2, 1, 5. The tiny gap comes from rounding the weights to 4 digits. Awesome! Now, let&amp;rsquo;s simulate more data, see if our code still works, and compare it with the &lt;code&gt;nnls&lt;/code&gt; package!&lt;/p&gt;
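&lt;p&gt;The loop above is deliberately bare-bones. As a slightly more defensive sketch (not the reference Lawson-Hanson code, and not necessarily what the &lt;code&gt;nnls&lt;/code&gt; package does internally), the version below takes the smallest &lt;code&gt;alpha&lt;/code&gt; when several weights go negative, moves zeroed variables back to &lt;code&gt;R&lt;/code&gt; and re-solves, and uses a tolerance instead of rounding. The function name &lt;code&gt;nnls_sketch&lt;/code&gt; and the arguments &lt;code&gt;tol&lt;/code&gt; and &lt;code&gt;max_iter&lt;/code&gt; are made up for illustration.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;nnls_sketch &amp;lt;- function(X, y, tol = 1e-10, max_iter = 100) {
  n &amp;lt;- ncol(X)
  P &amp;lt;- integer(0)   # active set (positive weights), same naming as above
  R &amp;lt;- seq_len(n)   # passive set (weights held at zero)
  beta &amp;lt;- rep(0, n)
  for (iter in seq_len(max_iter)) {
    # steps 1 and 2: gradient, then stop if no passive variable can help
    w &amp;lt;- drop(crossprod(X, y - X %*% beta))
    if (length(R) == 0 || max(w[R]) &amp;lt;= tol) break
    j &amp;lt;- R[which.max(w[R])]
    P &amp;lt;- sort(c(P, j))
    R &amp;lt;- setdiff(R, j)
    repeat {
      # step 3: unconstrained least squares on the active columns
      XP &amp;lt;- X[, P, drop = FALSE]
      s &amp;lt;- rep(0, n)
      s[P] &amp;lt;- drop(solve(crossprod(XP), crossprod(XP, y)))
      if (all(s[P] &amp;gt; tol)) { beta &amp;lt;- s; break }
      # step 4: partial step toward s, then send zeroed weights back to R
      neg &amp;lt;- P[s[P] &amp;lt;= tol]
      alpha &amp;lt;- min(beta[neg] / (beta[neg] - s[neg]))
      beta &amp;lt;- beta + alpha * (s - beta)
      back &amp;lt;- P[beta[P] &amp;lt;= tol]
      beta[back] &amp;lt;- 0
      P &amp;lt;- setdiff(P, back)
      R &amp;lt;- sort(c(R, back))
      if (length(P) == 0) break
    }
  }
  beta
}

# same toy example as above
X &amp;lt;- rbind(c(1.5, 3, 4), c(0.5, 2, 3), c(4.5, 6, 6))
y &amp;lt;- c(2, 1, 5)
print(nnls_sketch(X, y))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;On the toy example this should land close to the weights we found above (roughly 0.667, 0.333 and 0), without the rounding step.&lt;/p&gt;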




&lt;h2 id=&#34;code&#34;&gt;Let&amp;rsquo;s Put Them All Together
  &lt;a href=&#34;#code&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;




&lt;h4 id=&#34;simulate-data&#34;&gt;Simulate Data
  &lt;a href=&#34;#simulate-data&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# labels/outcome/y&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;num_labels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;label_range &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sample&lt;/span&gt;(label_range, num_labels, replace&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# X matrix&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;num_models &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;matrix&lt;/span&gt;(nrow &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; num_labels, ncol &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; num_models)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;num_models) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  sd &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sample&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0.01&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt;))[1] &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# one noise level per model; rnorm(1, ...) only uses the first value&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(j in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;num_labels) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  X[j, i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, mean &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; y[j], sd &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sd)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
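&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Alright, what we did above is simulate &lt;code&gt;y&lt;/code&gt; (1000 labels drawn from 1 to 5) and &lt;code&gt;X&lt;/code&gt; (5 model columns, each a noisy copy of &lt;code&gt;y&lt;/code&gt; with a standard deviation picked at random from 0.01, 1, or 10).&lt;/p&gt;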
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;P &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;R &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dim&lt;/span&gt;(X)[2])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;beta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rep&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dim&lt;/span&gt;(X)[2])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;while &lt;/span&gt;(&lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# step 1: find gradient &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gradient &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;t&lt;/span&gt;(X) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; (y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; beta)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sum&lt;/span&gt;(gradient&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dim&lt;/span&gt;(X)[2]) { &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;all gradients are zero or negative, we have achieved optimality&amp;#34;&lt;/span&gt;) ; break }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(R)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) { &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;R is empty&amp;#34;&lt;/span&gt;) ; break }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# step 2: check optimality&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gradient_not_active &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; gradient
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;gradient_not_active[P] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;Inf&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;P_x &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;which&lt;/span&gt;(gradient_not_active&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;max&lt;/span&gt;(gradient_not_active))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;P &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(P,P_x) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unique&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sort&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;R &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;setdiff&lt;/span&gt;(R, P_x)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# solve P&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;beta_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;beta_i[P] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;solve&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;t&lt;/span&gt;(X[,P]) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; X[,P]) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;t&lt;/span&gt;(X[,P])&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt;y
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;any&lt;/span&gt;(beta_i&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)) { 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;negative weights: &amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(beta_i, collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;)))  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  idx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;which&lt;/span&gt;(beta_i&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  beta_old &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta[idx]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  beta_new &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta_i[idx]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  alpha &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta_old&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/-&lt;/span&gt;(beta_new&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;beta_old)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  beta_i_new &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; alpha&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;(beta&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;beta_i) 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  beta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta_i_new &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(digits &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;new weights after setting negative weight as zero: &amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(beta,collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;)))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  } else {  beta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta_i ; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(beta) }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 0.000000 0.000000 1.000061 0.000000 0.000000
## [1] 1.289716e-05 0.000000e+00 1.000049e+00 0.000000e+00 0.000000e+00
## [1] &amp;#34;negative weights: -4.80441441441621e-06 0 0.530291436850037 0 0.469820844889452&amp;#34;
## [1] &amp;#34;new weights after setting negative weight as zero: 0 0 0.6578 0 0.3423&amp;#34;
## [1] &amp;#34;negative weights: -4.57899476983191e-06 6.81265663325082e-05 0.530205009482599 0 0.469839806590533&amp;#34;
## [1] &amp;#34;new weights after setting negative weight as zero: 0 0 0.6578 0 0.3423&amp;#34;
## [1] &amp;#34;negative weights: -4.73337830228063e-06 6.51033708645959e-05 0.530467184328906 -5.88823524796493e-06 0.469586269810267&amp;#34;
&lt;/code&gt;&lt;/pre&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## Warning in alpha * (beta - beta_i): longer object length is not a multiple of
## shorter object length
&lt;/code&gt;&lt;/pre&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] &amp;#34;new weights after setting negative weight as zero: 0 0 0.6578 0 0.3423&amp;#34;
## [1] &amp;#34;R is empty&amp;#34;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Let&amp;rsquo;s look at our weights and RMSE&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;beta
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 0.0000 0.0000 0.6578 0.0000 0.3423
&lt;/code&gt;&lt;/pre&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sqrt&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;((y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; beta)^2))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 0.007206879
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Let&amp;rsquo;s try the &lt;code&gt;nnls&lt;/code&gt; package and see if we get the same result&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; nnls&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;nnls&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;X,b&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;y)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## Nonnegative least squares model
## x estimates: 0 6.970084e-05 0.53029 0 0.4697488 
## residual sum-of-squares: 0.04886
## reason terminated: The solution has been computed sucessfully.
&lt;/code&gt;&lt;/pre&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sqrt&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;((y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;x)^2))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 0.006989728
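&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Wow, awesome! The RMSE is very close (0.0072 from our loop vs 0.0070 from &lt;code&gt;nnls&lt;/code&gt;), although the weights are not identical (0.66/0.34 vs 0.53/0.47). A likely reason is that our loop never moves a coefficient that was pulled to zero back out of &lt;code&gt;P&lt;/code&gt;, so the later steps have &lt;code&gt;alpha = 0&lt;/code&gt; and the weights stop updating. Still, we&amp;rsquo;re able to reproduce the gist of the nnls portion from scratch. Next, let&amp;rsquo;s simulate some non-linear data, train a few different models, and see how the end result looks!&lt;/p&gt;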
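&lt;p&gt;For reference, here is a minimal sketch of the same active-set idea wrapped in a function. The name &lt;code&gt;nnls_sketch&lt;/code&gt; and its &lt;code&gt;tol&lt;/code&gt; and &lt;code&gt;max_iter&lt;/code&gt; arguments are hypothetical, not part of the &lt;code&gt;nnls&lt;/code&gt; package. It also makes two choices that differ from the loop above and should be read as assumptions: the step length is the smallest ratio across all non-positive coefficients, and any coefficient that reaches zero is dropped from &lt;code&gt;P&lt;/code&gt; before re-solving.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper (not from the nnls package): Lawson-Hanson style active-set NNLS
nnls_sketch &amp;lt;- function(X, y, tol = 1e-10, max_iter = 500) {
  p &amp;lt;- ncol(X)
  beta &amp;lt;- rep(0, p)
  P &amp;lt;- integer(0)                       # passive set: coefficients free to be positive
  for (iter in seq_len(max_iter)) {
    gradient &amp;lt;- drop(t(X) %*% (y - X %*% beta))
    R &amp;lt;- setdiff(seq_len(p), P)         # active set: coefficients held at zero
    # stop when nothing outside P can still reduce the squared error
    if (length(R) == 0 || all(gradient[R] &amp;lt;= tol)) break
    # move the index with the largest gradient into P
    P &amp;lt;- sort(c(P, R[which.max(gradient[R])]))
    repeat {
      s &amp;lt;- rep(0, p)
      if (length(P) == 0) break
      XP &amp;lt;- X[, P, drop = FALSE]
      # unconstrained least squares restricted to the columns in P
      s[P] &amp;lt;- solve(crossprod(XP), crossprod(XP, y))
      if (all(s[P] &amp;gt; 0)) break
      # step toward s only until the first coefficient hits zero
      neg &amp;lt;- P[s[P] &amp;lt;= 0]
      alpha &amp;lt;- min(ifelse(beta[neg] &amp;gt; 0, beta[neg] / (beta[neg] - s[neg]), 0))
      beta &amp;lt;- beta + alpha * (s - beta)
      # coefficients that reached zero go back to the active set
      P &amp;lt;- P[beta[P] &amp;gt; tol]
    }
    beta &amp;lt;- s
  }
  beta
}

# toy check: with an identity design the solution is y clipped at zero
nnls_sketch(diag(3), c(1, -2, 3))  # returns 1 0 3
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Calling &lt;code&gt;nnls_sketch(X, y)&lt;/code&gt; on the prediction matrix above would be one way to check whether these two changes bring the weights closer to what &lt;code&gt;nnls::nnls&lt;/code&gt; reports. I have not run that comparison here.&lt;/p&gt;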




&lt;h3 id=&#34;super&#34;&gt;Let&amp;rsquo;s Super Learn this thing
  &lt;a href=&#34;#super&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Click &lt;code&gt;code&lt;/code&gt; below to expand the entire procedure. We basically ran 3 different models in tidymodels (linear regression, xgboost, and random forest) with the recipe &lt;code&gt;y ~ .&lt;/code&gt;, without specifying any interaction or polynomial terms, on the simulated data below.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;x &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;w &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;x)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;x &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;w &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;w &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.05&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;x^2
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We set a seed for reproducibility and create 5 folds for cross-validation. We then extract the validation-set predictions from each model and stack them into the &lt;code&gt;X&lt;/code&gt; matrix, and extract the RMSE from each model and stack them into the &lt;code&gt;metrics&lt;/code&gt; matrix. Finally, we run our nnls code above to get the weights and RMSE for the super learner. We repeat this for 1000 iterations and log the results.&lt;/p&gt;
&lt;p&gt;This cross-validation step is the defining feature of the Super Learner. By fitting each base learner on the training folds and predicting only the held-out fold, we obtain a prediction matrix in which no row was predicted by a model that saw that row during fitting. That out-of-fold matrix is then used to estimate the ensemble weights via NNLS.&lt;/p&gt;
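&lt;p&gt;To make the out-of-fold idea concrete, here is a small base R sketch, separate from the tidymodels code below. The two &lt;code&gt;lm&lt;/code&gt; formulas, the smaller &lt;code&gt;n&lt;/code&gt;, and the names &lt;code&gt;learners&lt;/code&gt;, &lt;code&gt;fold_id&lt;/code&gt; and &lt;code&gt;Z&lt;/code&gt; are my own stand-ins for illustration. It assumes the &lt;code&gt;nnls&lt;/code&gt; package is installed.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;set.seed(1)
n &amp;lt;- 200
x &amp;lt;- rnorm(n)
w &amp;lt;- rnorm(n, 0.5 * x)
y &amp;lt;- 0.2 * x + 0.5 * w + 0.2 * x * w + 0.05 * x^2
df &amp;lt;- data.frame(x, w, y)

# two hypothetical base learners: main effects only vs. interaction plus x^2
learners &amp;lt;- list(
  main = y ~ x + w,
  full = y ~ x * w + I(x^2)
)

# assign every row to one of 5 folds
fold_id &amp;lt;- sample(rep(1:5, length.out = n))
Z &amp;lt;- matrix(NA_real_, nrow = n, ncol = length(learners),
            dimnames = list(NULL, names(learners)))

for (k in 1:5) {
  held_out &amp;lt;- fold_id == k
  for (j in seq_along(learners)) {
    # fit on the other 4 folds, predict only the held-out fold
    fit &amp;lt;- lm(learners[[j]], data = df[!held_out, ])
    Z[held_out, j] &amp;lt;- predict(fit, newdata = df[held_out, ])
  }
}

# Z now holds one out-of-fold prediction per row and learner
weights &amp;lt;- nnls::nnls(A = Z, b = df$y)$x
weights
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Since &lt;code&gt;full&lt;/code&gt; has the same form as the simulated formula and there is no noise term, I would expect nearly all of the weight to land on it. The point of the sketch is only the shape of &lt;code&gt;Z&lt;/code&gt;: one column per learner, filled fold by fold.&lt;/p&gt;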
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidymodels)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(future)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(furrr)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Set up parallel processing&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plan&lt;/span&gt;(multisession, workers &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;availableCores&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Define the function to run for each iteration&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;run_iteration &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(i) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  x &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  w &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;x)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;x &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;w &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;w &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.05&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;x^2
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(x,w,y)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  split &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;initial_split&lt;/span&gt;(df)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;training&lt;/span&gt;(split)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  test &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;testing&lt;/span&gt;(split)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# preprocess&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  rec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;recipe&lt;/span&gt;(y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; ., data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;train) 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# linear regression&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  lr_spec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;linear_reg&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  wf &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;workflow&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_recipe&lt;/span&gt;(rec) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_model&lt;/span&gt;(lr_spec)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  folds &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vfold_cv&lt;/span&gt;(train, &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  cv_results &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; wf &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;fit_resamples&lt;/span&gt;(folds, control &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;control_resamples&lt;/span&gt;(save_pred &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  cv_metrics &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;collect_metrics&lt;/span&gt;(cv_results) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(.metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rmse&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(mean)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  cv_preds &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;collect_predictions&lt;/span&gt;(cv_results) 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#xgboost&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  xgb_spec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;boost_tree&lt;/span&gt;(engine &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgboost&amp;#34;&lt;/span&gt;, mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;regression&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  wf &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;workflow&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_recipe&lt;/span&gt;(rec) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_model&lt;/span&gt;(xgb_spec)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  cv_results &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; wf &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;fit_resamples&lt;/span&gt;(folds, control &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;control_resamples&lt;/span&gt;(save_pred &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  cv_metrics2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;collect_metrics&lt;/span&gt;(cv_results) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(.metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rmse&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(mean)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  cv_preds2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;collect_predictions&lt;/span&gt;(cv_results) 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# random forest&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  rf_spec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rand_forest&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;regression&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  wf &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;workflow&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_recipe&lt;/span&gt;(rec) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_model&lt;/span&gt;(rf_spec)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  cv_results &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; wf &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;fit_resamples&lt;/span&gt;(folds, control &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;control_resamples&lt;/span&gt;(save_pred &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  cv_metrics3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;collect_metrics&lt;/span&gt;(cv_results) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(.metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rmse&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(mean)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  cv_preds3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;collect_predictions&lt;/span&gt;(cv_results) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rf&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;cbind&lt;/span&gt;(cv_preds &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(X1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;.pred),cv_preds2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(X2&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;.pred), cv_preds3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(X3&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;.pred)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.matrix&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; cv_preds &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(y) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.matrix&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  metrics &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;cbind&lt;/span&gt;(cv_metrics,cv_metrics2,cv_metrics3)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# nnls&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  P &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  R &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dim&lt;/span&gt;(X)[2])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  beta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rep&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dim&lt;/span&gt;(X)[2])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;while &lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# step 1: find gradient &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    gradient &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;t&lt;/span&gt;(X) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; (y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; beta)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sum&lt;/span&gt;(gradient&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;dim&lt;/span&gt;(X)[2]) { &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;all gradients are zero or negative, we have achieved optimality&amp;#34;&lt;/span&gt;) ; break }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(R)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) { &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;R is empty&amp;#34;&lt;/span&gt;) ; break }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# step 2: not optimal yet, so move the index with the largest gradient from R into P&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    gradient_not_active &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; gradient
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    gradient_not_active[P] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;Inf&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    P_x &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;which&lt;/span&gt;(gradient_not_active&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;max&lt;/span&gt;(gradient_not_active))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    P &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(P,P_x) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unique&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sort&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    R &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;setdiff&lt;/span&gt;(R, P_x)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# solve least squares using only the columns in P&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    beta_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    beta_i[P] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;solve&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;t&lt;/span&gt;(X[,P]) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; X[,P]) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;t&lt;/span&gt;(X[,P])&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt;y
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;any&lt;/span&gt;(beta_i&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)) { 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;negative weights: &amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(beta_i, collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;)))  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      idx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;which&lt;/span&gt;(beta_i&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      beta_old &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta[idx]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      beta_new &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta_i[idx]
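&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      alpha &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;min&lt;/span&gt;(beta_old &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt; (beta_old &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; beta_new)) &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# smallest step, so alpha stays a scalar when several weights go negative&lt;/span&gt;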
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      beta_i_new &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; alpha&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;(beta&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;beta_i) 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      beta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta_i_new &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(digits &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;new weights after setting negative weight as zero: &amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(beta,collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;)))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    } else {  beta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; beta_i ; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(beta) }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  rmse_superlearner &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sqrt&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;((y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; beta)^2))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  rmse_result &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sum&lt;/span&gt;(metrics &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;&lt;/span&gt; rmse_superlearner) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;) { &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;solo_better&amp;#34;&lt;/span&gt; } else { &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;superlearner_better&amp;#34;&lt;/span&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; nnls&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;nnls&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;X,b&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;y)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  rmse_ours_nnls &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(rmse_superlearner, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sqrt&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;((y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; X &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%*%&lt;/span&gt; model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;x)^2)))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  same_weights_result &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sum&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(beta, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;x, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;) { &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;same&amp;#34;&lt;/span&gt; } else { &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;not_same&amp;#34;&lt;/span&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  weights_log &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;x, beta)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(rmse_log &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; rmse_result, same_weights_log &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; same_weights_result, weights_log &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; weights_log, rmse_ours_nnls &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; rmse_ours_nnls))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Run with future_map&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;results &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;future_map&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;, run_iteration, .options &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;furrr_options&lt;/span&gt;(seed &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;), .progress &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
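&lt;p&gt;One way to sanity-check a non-negative least squares solution, whichever routine produced it, is to test the Karush-Kuhn-Tucker conditions directly: every weight is non-negative, the gradient &lt;code&gt;t(X) %*% (y - X %*% beta)&lt;/code&gt; is about zero wherever the weight is positive, and it is zero or negative wherever the weight sits at zero. Below is a minimal sketch on simulated data. The helper name &lt;code&gt;check_kkt&lt;/code&gt;, the tolerance and the toy matrix are assumptions for illustration and are not part of the simulation above.&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper: checks the KKT conditions of a candidate NNLS solution
check_kkt &amp;lt;- function(X, y, beta, tol = 1e-6) {
  w &amp;lt;- as.vector(t(X) %*% (y - X %*% beta))  # same gradient as in the loop above
  positive &amp;lt;- beta &amp;gt; tol                      # weights that are strictly positive
  all(
    beta &amp;gt;= -tol,                             # non-negativity
    abs(w[positive]) &amp;lt;= tol,                  # zero gradient where the weight is positive
    w[!positive] &amp;lt;= tol                       # no gain from raising a zero weight
  )
}

# Toy data (assumed): two useful predictors and one pure-noise predictor
set.seed(1)
X &amp;lt;- cbind(x1 = runif(200), x2 = runif(200), x3 = runif(200))
y &amp;lt;- 0.7 * X[, &amp;#34;x1&amp;#34;] + 0.3 * X[, &amp;#34;x2&amp;#34;] + rnorm(200, sd = 0.05)

fit &amp;lt;- nnls::nnls(A = X, b = y)
check_kkt(X, y, fit$x)       # candidate solution from the nnls package
check_kkt(X, y, c(0, 0, 0))  # all-zero weights leave positive gradients, so this fails
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p&gt;The same helper could be pointed at the &lt;code&gt;beta&lt;/code&gt; from the hand-rolled loop, with a looser &lt;code&gt;tol&lt;/code&gt; if the weights were rounded to 4 digits along the way.&lt;/p&gt;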




&lt;h4 id=&#34;lets-compare-rmse-of-solo-models-vs-super-learner-models&#34;&gt;Let&amp;rsquo;s Compare RMSE of Solo models vs Super Learner models
  &lt;a href=&#34;#lets-compare-rmse-of-solo-models-vs-super-learner-models&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidyverse)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Extract results&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;rmse_log &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;map_chr&lt;/span&gt;(results, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;rmse_log&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;same_weights_log &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;map_chr&lt;/span&gt;(results, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;same_weights_log&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;weight_logs &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;matrix&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;NA&lt;/span&gt;, ncol &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6&lt;/span&gt;, nrow &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  weight_logs[i, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;6&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; results[[i]]&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;weights_log
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plotrmse_log &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(rmse&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;rmse_log) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggplot&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;rmse)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_bar&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme_bw&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/superlearner/index_files/figure-html/unnamed-chunk-12-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Wow, look at that! The superlearner (ensembled) model does appear to have a better RMSE than the solo models! Let&amp;rsquo;s see whether our noob NNLS from scratch is comparable to the &lt;code&gt;nnls&lt;/code&gt; package.&lt;/p&gt;




&lt;h4 id=&#34;comparing-our-nnls-to-nnls-package&#34;&gt;Comparing Our NNLS to &lt;code&gt;nnls&lt;/code&gt; package
  &lt;a href=&#34;#comparing-our-nnls-to-nnls-package&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plotsameweights &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(same_weights&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;same_weights_log) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggplot&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;same_weights)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_bar&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme_bw&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/superlearner/index_files/figure-html/unnamed-chunk-14-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Wow, most of the weights are the same when rounded to 4 decimal places! Let&amp;rsquo;s check the ones that differ: are they REALLY that different?&lt;/p&gt;
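&lt;p&gt;A side note on the comparison itself: testing rounded values with &lt;code&gt;==&lt;/code&gt; can flag two weights as different even when they are nearly identical, because values on either side of a rounding boundary round apart. A tolerance-based check is one alternative. The sketch below uses made-up weights, and the names &lt;code&gt;w_ours&lt;/code&gt;, &lt;code&gt;w_pkg&lt;/code&gt; and &lt;code&gt;tol&lt;/code&gt; are assumptions, not objects from the simulation.&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Made-up weights that differ by only 2e-8 but straddle a rounding boundary
w_ours &amp;lt;- c(0, 0.41235001, 0.58764999)
w_pkg  &amp;lt;- c(0, 0.41234999, 0.58765001)

# Rounded equality, as in the simulation: the two vectors round apart
rounded_same &amp;lt;- all(round(w_ours, 4) == round(w_pkg, 4))

# Tolerance-based equality: largest absolute gap below a chosen threshold
tol &amp;lt;- 1e-4
tolerance_same &amp;lt;- max(abs(w_ours - w_pkg)) &amp;lt; tol

c(rounded = rounded_same, tolerance = tolerance_same)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p&gt;Here the rounded check reports a mismatch while the tolerance check does not, so some of the &lt;code&gt;not_same&lt;/code&gt; cases may be rounding artefacts and not real disagreements.&lt;/p&gt;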
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plotdiff123 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; weight_logs &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as_tibble&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(diff1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V4&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;V1, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         diff2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V5&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;V2, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         diff3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V6&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;V3, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         sum_diff &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; diff1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;diff2&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;diff3) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(sum_diff &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pivot_longer&lt;/span&gt;(cols &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(diff1,diff2,diff3), names_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;diff&amp;#34;&lt;/span&gt;, values_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;values&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggplot&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;values,fill&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;diff)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_histogram&lt;/span&gt;(position &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;dodge2&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme_bw&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/superlearner/index_files/figure-html/unnamed-chunk-16-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;This makes sense: most of the differences are in &lt;code&gt;xgboost&lt;/code&gt; (diff2) and &lt;code&gt;random forest&lt;/code&gt; (diff3). Our linear regression model (diff1), without the correct specification, probably won&amp;rsquo;t contribute a whole lot, so any difference between our algorithm and &lt;code&gt;nnls&lt;/code&gt; on that weight would be minimal (centered, in red). It also makes sense that a difference in the xgboost or random forest weight shows up as a different weight on the other model&amp;rsquo;s contribution. Now the question is: do these weight differences make a huge difference in RMSE? I suspect not so much.&lt;/p&gt;
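&lt;p&gt;Because weight taken from one model tends to land on another, the signed differences can partly or fully cancel when added together. The small made-up example below (the tibble &lt;code&gt;toy&lt;/code&gt;, its values and the column &lt;code&gt;sum_abs_diff&lt;/code&gt; are assumptions, reusing the diff1/diff2/diff3 naming from above) compares the signed sum with a sum of absolute differences.&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(dplyr)

# Made-up differences: row 2 shifts 0.01 of weight from one model to the other
toy &amp;lt;- tibble(
  diff1 = c(0, 0, 0),
  diff2 = c(0, 0.01, 0.02),
  diff3 = c(0, -0.01, -0.015)
) |&amp;gt;
  mutate(sum_diff = diff1 + diff2 + diff3,                    # signed sum, can cancel to zero
         sum_abs_diff = abs(diff1) + abs(diff2) + abs(diff3)) # total size of the change

toy
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p&gt;In this toy table, row 2 has a signed sum of zero even though two weights moved, so a filter on the signed sum would drop it while a filter on &lt;code&gt;sum_abs_diff != 0&lt;/code&gt; would keep it.&lt;/p&gt;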
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;rmse_compare &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;matrix&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;NA&lt;/span&gt;, ncol &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;, nrow &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  rmse_compare[i,&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; results[[i]]&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;rmse_ours_nnls 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plotcompare &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; weight_logs &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as_tibble&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(row &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;row_number&lt;/span&gt;()) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(diff1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V4&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;V1, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         diff2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V5&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;V2, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         diff3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V6&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;V3, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         sum_diff &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; diff1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;diff2&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;diff3) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(sum_diff &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;left_join&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as_tibble&lt;/span&gt;(rmse_compare) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(row &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;row_number&lt;/span&gt;()), by &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;row&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(V1.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V1.y, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         V2.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V2.y, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(check &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    V1.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; V2.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;same&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    V1.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;&lt;/span&gt; V2.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;our_nnls_better&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    V1.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;&lt;/span&gt; V2.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;nnls_package_better&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;NA_character_&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggplot&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;check)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_bar&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme_bw&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/superlearner/index_files/figure-html/unnamed-chunk-18-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;lol, &lt;code&gt;nnls&lt;/code&gt; package clearly is better than our home-grown algorithm! But by how much?&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plotdiffrmse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; weight_logs &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as_tibble&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(row &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;row_number&lt;/span&gt;()) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(diff1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V4&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;V1, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         diff2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V5&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;V2, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         diff3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V6&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;V3, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         sum_diff &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; diff1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;diff2&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;diff3) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(sum_diff &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;left_join&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as_tibble&lt;/span&gt;(rmse_compare) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(row &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;row_number&lt;/span&gt;()), by &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;row&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(V1.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V1.y, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         V2.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V2.y, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(diff_rmse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; V1.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; V2.y) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggplot&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(diff_rmse)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_histogram&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme_bw&lt;/span&gt;() 
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/superlearner/index_files/figure-html/unnamed-chunk-20-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;😵‍💫 It&amp;rsquo;s really not that much different! Let&amp;rsquo;s find the max.&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;weight_logs &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as_tibble&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(row &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;row_number&lt;/span&gt;()) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(diff1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V4&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;V1, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         diff2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V5&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;V2, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         diff3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V6&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;V3, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         sum_diff &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; diff1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;diff2&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;diff3) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(sum_diff &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;left_join&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as_tibble&lt;/span&gt;(rmse_compare) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(row &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;row_number&lt;/span&gt;()), by &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;row&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(V1.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V1.y, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         V2.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(V2.y, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(diff_rmse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; V1.y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; V2.y) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(diff_rmse) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;max&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 9e-04
&lt;/code&gt;&lt;/pre&gt;&lt;/details&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 9e-04
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;🥹 Does that mean our home-grown algorithm works just as well? You be the judge. Let me know if this is due to pure luck!&lt;/p&gt;




&lt;h2 id=&#34;acknowledgement&#34;&gt;Acknowledgement:
  &lt;a href=&#34;#acknowledgement&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Thanks to 
&lt;a href=&#34;https://health.uchicago.edu/faculty/eric-polley-phd&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eric Polley&lt;/a&gt; for correcting me and explaining that NNLS does not require the beta coefficients to sum to 1 (only that they are non-negative). Also, Super Learner theory does not require NNLS; NNLS works well in practice, is often much faster than a true convex combination optimization, and can be seen in the early work on Stacked Regression by Leo Breiman. Much appreciated!&lt;/p&gt;




&lt;h2 id=&#34;opportunity&#34;&gt;Opportunities for improvement
  &lt;a href=&#34;#opportunity&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;will try multicore sometime in the future, is it really faster than multisession?&lt;/li&gt;
&lt;li&gt;need to learn/figure out FAST nnls algorithm which I believe &lt;code&gt;nnls&lt;/code&gt; package uses&lt;/li&gt;
&lt;li&gt;need to venture more in parallel computing&lt;/li&gt;
&lt;li&gt;compare with the actual &lt;code&gt;SuperLearner&lt;/code&gt; package&lt;/li&gt;
&lt;/ul&gt;




&lt;h2 id=&#34;lessons&#34;&gt;Lessons learnt
  &lt;a href=&#34;#lessons&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;learnt to build a Super Learner using a non-negative least squares model&lt;/li&gt;
&lt;li&gt;learnt Lawson-Hanson algorithm and how it&amp;rsquo;s implemented, compared with &lt;code&gt;nnls&lt;/code&gt; and results not too shabby!&lt;/li&gt;
&lt;li&gt;learnt some basics of parallel computing&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you like this article:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;please feel free to send me a 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;comment or visit my other blogs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;please feel free to follow me on 
&lt;a href=&#34;https://bsky.app/profile/kenkoonwong.bsky.social&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;BlueSky&lt;/a&gt;, 
&lt;a href=&#34;https://twitter.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;twitter&lt;/a&gt;, 
&lt;a href=&#34;https://github.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub&lt;/a&gt; or 
&lt;a href=&#34;https://med-mastodon.com/@kenkoonwong&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Mastodon&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;if you would like to collaborate, please feel free to 
&lt;a href=&#34;https://www.kenkoonwong.com/contact/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;contact me&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>Bias, Variance, and Doubly Robust Estimation: Testing The Promise of TMLE in Simulated Data</title>
      <link>https://www.kenkoonwong.com/blog/tmle/</link>
      <pubDate>Sun, 16 Nov 2025 00:00:00 +0000</pubDate>
      
      <guid>https://www.kenkoonwong.com/blog/tmle/</guid>
      <description>&lt;script src=&#34;https://www.kenkoonwong.com/blog/tmle/index_files/kePrint/kePrint.js&#34;&gt;&lt;/script&gt;
&lt;link href=&#34;https://www.kenkoonwong.com/blog/tmle/index_files/lightable/lightable.css&#34; rel=&#34;stylesheet&#34; /&gt;
&lt;blockquote&gt;
&lt;p&gt;Finally understood TMLE&amp;rsquo;s &amp;ldquo;doubly robust&amp;rdquo; property through simulation. Works well when either outcome OR treatment model is correct. XGBoost + TMLE captured complex relationships without manual specification. But beware the confidence intervals - Frank Harrell was right to say &amp;ldquo;prepare to be disappointed!&amp;rdquo; 🤔&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src=&#34;dr.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;




&lt;h2 id=&#34;motivations&#34;&gt;Motivations:
  &lt;a href=&#34;#motivations&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;I&amp;rsquo;ve always heard about Targeted Maximum Likelihood Estimation (TMLE) and I&amp;rsquo;ve read 
&lt;a href=&#34;https://github.com/kathoffman/causal-inference-visual-guides/blob/master/visual-guides/TMLE.pdf&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Katherine Hoffman&lt;/a&gt;&amp;rsquo;s blog post several times. I printed her cheat sheet and went through it several times. Each time I thought I understood it, the next time I found myself questioning my understanding. 🤣 So, what better way to dive a tad deeper into the machinery behind this and why it is useful? Let&amp;rsquo;s go!&lt;/p&gt;
&lt;p&gt;Just to set the context right, we&amp;rsquo;re going to estimate Average Treatment Effect (ATE) and use &lt;code&gt;g-computation&lt;/code&gt; as a standard approach.&lt;/p&gt;




&lt;h2 id=&#34;objectives&#34;&gt;Objectives:
  &lt;a href=&#34;#objectives&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#tmle&#34;&gt;What is TMLE?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#dr&#34;&gt;What Does Doubly Robust mean?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#biasvariance&#34;&gt;What is Bias and Variance?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#data&#34;&gt;Simulate Data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#estimate&#34;&gt;Write a Function to Estimate&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#wrong_outcome&#34;&gt;Wrong outcome model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#gcomp&#34;&gt;G-computation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#correct_outcome&#34;&gt;Correct outcome model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#wrong_ps_model&#34;&gt;Wrong treatment model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#correct_treatment&#34;&gt;Correct treatment model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#tmle_procedure&#34;&gt;TMLE Steps&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#models&#34;&gt;Comparing methods and models&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#traditional&#34;&gt;Is there a traditional statistical method for this?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#tails&#34;&gt;But Does It Really Work?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#ack&#34;&gt;Acknowledgement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#opportunity&#34;&gt;Opportunities for improvement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#lessons&#34;&gt;Lessons learnt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2 id=&#34;tmle&#34;&gt;What is TMLE?
  &lt;a href=&#34;#tmle&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;TMLE is a statistical method used for estimating causal effects in observational studies and clinical trials. It combines elements of machine learning and traditional statistical techniques to provide robust estimates of treatment effects while controlling for confounding variables. TMLE operates in two main steps: first, it estimates the outcome model and the treatment model, and then it uses these models to adjust the treatment effect estimate, targeting the parameter of interest directly. This approach is particularly useful in settings where standard methods may be biased or inefficient, as it allows for the incorporation of flexible machine learning algorithms to improve estimation accuracy. You will often hear this method described as Doubly Robust. What exactly is robust, times two, about it?&lt;/p&gt;




&lt;h2 id=&#34;dr&#34;&gt;What Does Doubly Robust mean?
  &lt;a href=&#34;#dr&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Doubly Robust (DR) estimation refers to a statistical property of certain estimators that remain consistent if either the model for the treatment assignment (propensity score) or the model for the outcome is correctly specified, but not necessarily both. In other words, a doubly robust estimator provides two chances for obtaining a valid estimate of the causal effect: if one of the models is misspecified, as long as the other model is correctly specified, the estimator will still yield consistent results. This property is particularly advantageous in observational studies where there may be uncertainty about the correct specification of either model, enhancing the reliability of causal inferences drawn from the data. I didn&amp;rsquo;t quite understand this until we simulated the data to test this theory. It will, hopefully, be more clear when we go through the simulation. But, wait, what metrics should we use for this? Bias and variance!&lt;/p&gt;
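&lt;p&gt;One way to see the &amp;ldquo;two chances&amp;rdquo; on paper is the augmented inverse probability weighted (AIPW) estimator of the ATE. This is a different doubly robust estimator from TMLE, shown here only as an illustrative sketch. The notation is an assumption for this example: &lt;code&gt;\(A\)&lt;/code&gt; is the treatment, &lt;code&gt;\(W\)&lt;/code&gt; the confounders, &lt;code&gt;\(Y\)&lt;/code&gt; the outcome, &lt;code&gt;\(\hat{Q}(a, W)\)&lt;/code&gt; the fitted outcome model, and &lt;code&gt;\(\hat{g}(W)\)&lt;/code&gt; the fitted propensity score.
$$
\hat{\psi}_{\text{AIPW}} = \frac{1}{n}\sum_{i=1}^{n}\left[\hat{Q}(1, W_i) - \hat{Q}(0, W_i) + \frac{A_i}{\hat{g}(W_i)}\left(Y_i - \hat{Q}(1, W_i)\right) - \frac{1 - A_i}{1 - \hat{g}(W_i)}\left(Y_i - \hat{Q}(0, W_i)\right)\right]
$$
If the outcome model is correct, the residuals &lt;code&gt;\(Y_i - \hat{Q}(A_i, W_i)\)&lt;/code&gt; average out to zero whatever weights multiply them, so a wrong propensity model does little harm. If instead the propensity model is correct, the weighted residuals estimate the error left by a wrong outcome model and cancel it. TMLE aims for the same property, but it gets there by updating the outcome model in a targeting step rather than by adding a correction term.&lt;/p&gt;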




&lt;h2 id=&#34;biasvariance&#34;&gt;What is Bias and Variance?
  &lt;a href=&#34;#biasvariance&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Bias and variance are two fundamental concepts in statistics and machine learning that describe different sources of error in predictive models. Bias refers to the systematic error that occurs when a model consistently overestimates or underestimates the true value of a parameter. High bias can lead to underfitting, where the model fails to capture the underlying patterns in the data. Variance, on the other hand, refers to the variability of model predictions for different training datasets. High variance can lead to overfitting, where the model captures noise in the training data rather than the true signal. The trade-off between bias and variance is a key consideration in model selection and evaluation, as it affects the overall accuracy and generalizability of predictive models.&lt;/p&gt;
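&lt;p&gt;To make the trade-off concrete: for an estimator &lt;code&gt;\(\hat{\theta}\)&lt;/code&gt; of &lt;code&gt;\(\theta\)&lt;/code&gt;, a standard identity splits the mean squared error into squared bias plus variance (stated here as a general sketch, not something specific to TMLE):
$$
E\left[(\hat{\theta} - \theta)^2\right] = \left(E[\hat{\theta}] - \theta\right)^2 + E\left[\left(\hat{\theta} - E[\hat{\theta}]\right)^2\right]
$$
The first term is the squared bias and the second is the variance, so an estimator with small bias can still have a large error if it varies a lot from sample to sample, and vice versa.&lt;/p&gt;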
&lt;p&gt;The formula for bias is:
$$
\text{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta
$$
Where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;\(\hat{\theta}\)&lt;/code&gt; is the estimator of the parameter &lt;code&gt;\(\theta\)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;\(E[\hat{\theta}]\)&lt;/code&gt; is the expected value of the estimator&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In pseudo-R code, the bias calculation would look something like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_theta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  training_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; dplyr&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(original_training_data, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;nrow&lt;/span&gt;(original_training_data), replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(outcome&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt;treatment&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;confounder,data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;training_data)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome_hat_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model,newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; training_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(treatment &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome_hat_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model,newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; training_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(treatment &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  predicted_theta[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(outcome_hat_1) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(outcome_hat_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;bias &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(predicted_theta) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; theta
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;In my own words, bias is how close our &lt;code&gt;estimate on average&lt;/code&gt; is to the true value.&lt;/p&gt;
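&lt;p&gt;To make this concrete, here is a minimal runnable sketch on a toy problem (not the treatment-effect setting above, and all names are just illustrative). We estimate the variance of a normal distribution where the truth is known to be 4, once dividing by &lt;code&gt;n&lt;/code&gt; and once by &lt;code&gt;n - 1&lt;/code&gt;. In theory the first has a bias of &lt;code&gt;-theta/n&lt;/code&gt; (here -0.4) and the second is unbiased, so the simulated bias should land near those values.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;set.seed(1)

# Toy assumption: data come from N(0, 2^2), so the true variance is 4
theta &amp;lt;- 4
n_sims &amp;lt;- 5000
n_obs &amp;lt;- 10

theta_hat_biased &amp;lt;- numeric(n_sims)    # divides by n
theta_hat_unbiased &amp;lt;- numeric(n_sims)  # divides by n - 1, which is what var() does

for (i in seq_len(n_sims)) {
  x &amp;lt;- rnorm(n_obs, mean = 0, sd = 2)
  theta_hat_biased[i] &amp;lt;- mean((x - mean(x))^2)
  theta_hat_unbiased[i] &amp;lt;- var(x)
}

# Bias = average of the estimates minus the truth
bias_biased &amp;lt;- mean(theta_hat_biased) - theta
bias_unbiased &amp;lt;- mean(theta_hat_unbiased) - theta

c(biased = bias_biased, unbiased = bias_unbiased)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;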
&lt;p&gt;The formula for variance is:
$$
\text{Var}(\hat{\theta}) = E[(\hat{\theta} - E[\hat{\theta}])^2]
$$&lt;/p&gt;
&lt;p&gt;Where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;\(\hat{\theta}\)&lt;/code&gt; is the estimator of the parameter &lt;code&gt;\(\theta\)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;\(E[\hat{\theta}]\)&lt;/code&gt; is the expected value of the estimator&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In pseudo-R code, it would look something like this:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_theta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  training_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; dplyr&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(original_training_data, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;nrow&lt;/span&gt;(original_training_data), replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(outcome&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt;treatment&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;confounder,data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;training_data)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome_hat_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model,newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; training_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(treatment &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  outcome_hat_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model,newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; training_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(treatment &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  predicted_theta[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(outcome_hat_1) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(outcome_hat_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;variance &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;((predicted_theta&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(predicted_theta))^2)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# or, with an n - 1 denominator (almost identical for 1000 draws)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;variance &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;var&lt;/span&gt;(predicted_theta) 
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We will be using &lt;code&gt;bias&lt;/code&gt; and &lt;code&gt;variance&lt;/code&gt; to test the doubly robust theory. But first, let&amp;rsquo;s simulate some data!&lt;/p&gt;




&lt;h2 id=&#34;data&#34;&gt;Simulate Data
  &lt;a href=&#34;#data&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidyverse)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W4 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# TRUE propensity score model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-0.5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.8&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# TRUE outcome model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Calculate TRUE ATE&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;logit_Y1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;logit_Y0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y1_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(logit_Y1)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y0_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(logit_Y0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(Y1_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; Y0_true)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W1, W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W2, W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W3, W4 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W4, A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; A, Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; Y,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;           true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; true_ATE, Y1_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; Y1_true, Y0_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; Y0_true)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Alright, our true ATE here is 0.0373518. We&amp;rsquo;ll see whether the doubly robust method can recover it when either the outcome model or the treatment model is misspecified.&lt;/p&gt;
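&lt;p&gt;Before modelling anything, a quick sanity check is to compare the unadjusted difference in outcome between treated and untreated with the true ATE. The sketch below re-runs the same simulation with the same seed so it stands on its own; &lt;code&gt;lin_pred0&lt;/code&gt; and &lt;code&gt;naive_diff&lt;/code&gt; are names made up for illustration. Because W1 to W4 drive both treatment and outcome, the two numbers generally won&amp;rsquo;t agree, which is the whole reason we need adjustment.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;set.seed(1)

n &amp;lt;- 10000
W1 &amp;lt;- rnorm(n)
W2 &amp;lt;- rnorm(n)
W3 &amp;lt;- rbinom(n, 1, 0.5)
W4 &amp;lt;- rnorm(n)

# Treatment depends on the covariates (same formula as the simulation above)
A &amp;lt;- rbinom(n, 1, plogis(-0.5 + 0.8*W1 + 0.5*W2^2 + 0.3*W3 - 0.4*W1*W2 + 0.2*W4))

# Outcome linear predictor without the treatment term
lin_pred0 &amp;lt;- -1 + 0.6*W1 - 0.4*W2^2 + 0.5*W3 + 0.3*W1*W3 + 0.2*W4^2
Y &amp;lt;- rbinom(n, 1, plogis(lin_pred0 + 0.2*A))

# True ATE: average difference in outcome probability under A = 1 vs A = 0
true_ATE &amp;lt;- mean(plogis(lin_pred0 + 0.2) - plogis(lin_pred0))

# Naive estimate: raw difference in observed outcome means, no adjustment
naive_diff &amp;lt;- mean(Y[A == 1]) - mean(Y[A == 0])

c(naive = naive_diff, truth = true_ATE)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;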




&lt;h2 id=&#34;estimate&#34;&gt;Write a Function to Estimate
  &lt;a href=&#34;#estimate&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;




&lt;h3 id=&#34;wrong_outcome&#34;&gt;Let&amp;rsquo;s look at the WRONG Outcome model ❌
  &lt;a href=&#34;#wrong_outcome&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W4, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summary&lt;/span&gt;(model)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## 
## Call:
## glm(formula = Y ~ A + W1 + W2 + W3 + W4, family = &amp;#34;binomial&amp;#34;)
## 
## Coefficients:
##              Estimate Std. Error z value Pr(&amp;gt;|z|)    
## (Intercept) -1.045765   0.041489 -25.206   &amp;lt;2e-16 ***
## A           -0.050142   0.047732  -1.050    0.293    
## W1           0.767386   0.026058  29.449   &amp;lt;2e-16 ***
## W2          -0.024726   0.022807  -1.084    0.278    
## W3           0.561572   0.045658  12.300   &amp;lt;2e-16 ***
## W4          -0.003209   0.022382  -0.143    0.886    
## ---
## Signif. codes:  0 &amp;#39;***&amp;#39; 0.001 &amp;#39;**&amp;#39; 0.01 &amp;#39;*&amp;#39; 0.05 &amp;#39;.&amp;#39; 0.1 &amp;#39; &amp;#39; 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 12737  on 9999  degrees of freedom
## Residual deviance: 11519  on 9994  degrees of freedom
## AIC: 11531
## 
## Number of Fisher Scoring iterations: 4
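&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Ouch. A quick look at the coefficient of A gives -0.0501418, the opposite sign of the true ATE. Keep in mind that this coefficient is on the log-odds scale, so it is not directly comparable to the ATE, which is a difference in probabilities. Alright, let&amp;rsquo;s look at &lt;code&gt;g-computation&lt;/code&gt; and see if it tells the same story.&lt;/p&gt;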




&lt;h3 id=&#34;gcomp&#34;&gt;g-computation function
  &lt;a href=&#34;#gcomp&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;g_comp &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(model,data,ml&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(ml&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   y1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model, new_data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;   y0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model, new_data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  } else {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  y1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model, newdata&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;response&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  y0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model, newdata&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;response&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(y1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;y0))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;g_comp&lt;/span&gt;(model,df)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] -0.009823307
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Yup, incorrect! Now what if we use the RIGHT Outcome model?&lt;/p&gt;




&lt;h3 id=&#34;correct_outcome&#34;&gt;The correct outcome model ✅
  &lt;a href=&#34;#correct_outcome&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;I&lt;/span&gt;(W2^2) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W4, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;g_comp&lt;/span&gt;(model,df)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 0.03576854
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Wow! Look at that! If we correctly specify the outcome model, the estimate is VERY close to the true ATE!&lt;/p&gt;
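&lt;p&gt;To make that contrast concrete, here is a minimal, self-contained sketch of g-computation on toy data. The data-generating process, the sample size, and the names &lt;code&gt;toy&lt;/code&gt; and &lt;code&gt;g_comp_sketch&lt;/code&gt; are made up for illustration; this is not the simulation used in this post.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;set.seed(123)
n &amp;lt;- 20000

# Hypothetical toy data: one confounder W that acts through W^2
W &amp;lt;- rnorm(n)
A &amp;lt;- rbinom(n, 1, plogis(-0.5 + 0.8 * W^2))
Y &amp;lt;- rbinom(n, 1, plogis(-1 + 0.3 * A + 0.7 * W^2))
toy &amp;lt;- data.frame(Y, A, W)

# g-computation: predict everyone under A = 1 and under A = 0,
# then average the difference in predicted risk
g_comp_sketch &amp;lt;- function(model, data) {
  y1 &amp;lt;- predict(model, newdata = transform(data, A = 1), type = &amp;#34;response&amp;#34;)
  y0 &amp;lt;- predict(model, newdata = transform(data, A = 0), type = &amp;#34;response&amp;#34;)
  mean(y1 - y0)
}

# ATE implied by the toy data-generating process, averaged over the sampled W
true_ate &amp;lt;- mean(plogis(-1 + 0.3 + 0.7 * W^2) - plogis(-1 + 0.7 * W^2))

wrong &amp;lt;- glm(Y ~ A + W, family = &amp;#34;binomial&amp;#34;, data = toy)       # misses W^2
right &amp;lt;- glm(Y ~ A + I(W^2), family = &amp;#34;binomial&amp;#34;, data = toy)  # matches the toy process

c(truth = true_ate,
  wrong = g_comp_sketch(wrong, toy),
  right = g_comp_sketch(right, toy))
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;In a setup like this, the model that omits the quadratic term should land noticeably further from the truth than the one that includes it, which mirrors the two results above.&lt;/p&gt;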




&lt;h3 id=&#34;wrong_ps_model&#34;&gt;The wrong treatment model ❌ 
  &lt;a href=&#34;#wrong_ps_model&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Now what if we use &lt;code&gt;IPW&lt;/code&gt; with the wrong treatment model? Let&amp;rsquo;s see whether we can still estimate the ATE.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps_model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W4, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ps_model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;fitted.values
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmax&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmin&lt;/span&gt;(ps, &lt;span style=&#34;color:#099&#34;&gt;0.95&lt;/span&gt;), &lt;span style=&#34;color:#099&#34;&gt;0.05&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;weights &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;ps_final))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; A, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;, weights &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; weights)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;g_comp&lt;/span&gt;(model,df)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] -0.0095777
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Wow, very wrong indeed! Now let&amp;rsquo;s look at the right treatment model.&lt;/p&gt;




&lt;h3 id=&#34;correct_treatment&#34;&gt;The correct treatment model ✅
  &lt;a href=&#34;#correct_treatment&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps_model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;I&lt;/span&gt;(W2^2) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W4, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#-0.5 + 0.8*W1 + 0.5*W2^2 + 0.3*W3 - 0.4*W1*W2 + 0.2*W4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ps_model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;fitted.values
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmax&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmin&lt;/span&gt;(ps, &lt;span style=&#34;color:#099&#34;&gt;0.95&lt;/span&gt;), &lt;span style=&#34;color:#099&#34;&gt;0.05&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;weights &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;ps_final))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; A, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;, weights &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; weights)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;g_comp&lt;/span&gt;(model,df)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 0.03874379
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Not too shabby! Very close to our true ATE! To be honest, though, how on earth are we supposed to know beforehand the complex equation to specify for either the treatment or the outcome model!?&lt;/p&gt;
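&lt;p&gt;For intuition on what the weighted &lt;code&gt;glm(Y ~ A)&lt;/code&gt; above is doing: because that model is saturated in &lt;code&gt;A&lt;/code&gt;, its fitted probabilities are just the weighted outcome means in each arm, so pushing it through g-computation amounts to a normalized (Hajek-style) IPW difference in means. Below is a minimal sketch on toy data. The data-generating process and all object names are assumptions for illustration, not this post&amp;rsquo;s simulation. It uses &lt;code&gt;quasibinomial&lt;/code&gt; only to avoid the non-integer-successes warning that &lt;code&gt;binomial&lt;/code&gt; emits with fractional weights; the point estimates are the same.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;set.seed(42)
n &amp;lt;- 20000

# Hypothetical toy data with a single confounder W
W &amp;lt;- rnorm(n)
A &amp;lt;- rbinom(n, 1, plogis(-0.5 + 0.8 * W))
Y &amp;lt;- rbinom(n, 1, plogis(-1 + 0.3 * A + 0.7 * W))

# Propensity scores, clipped to [0.05, 0.95] as in the code above
ps &amp;lt;- glm(A ~ W, family = &amp;#34;binomial&amp;#34;)$fitted.values
ps &amp;lt;- pmax(pmin(ps, 0.95), 0.05)
w  &amp;lt;- ifelse(A == 1, 1 / ps, 1 / (1 - ps))

# Normalized (Hajek) IPW estimate: difference of weighted outcome means
ipw_direct &amp;lt;- weighted.mean(Y[A == 1], w[A == 1]) -
  weighted.mean(Y[A == 0], w[A == 0])

# Same estimate via a weighted outcome model that contains only A
fit &amp;lt;- glm(Y ~ A, family = &amp;#34;quasibinomial&amp;#34;, weights = w)
ipw_glm &amp;lt;- predict(fit, newdata = data.frame(A = 1), type = &amp;#34;response&amp;#34;) -
  predict(fit, newdata = data.frame(A = 0), type = &amp;#34;response&amp;#34;)

# The two routes should agree up to numerical tolerance
all.equal(ipw_direct, unname(ipw_glm), tolerance = 1e-6)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Seen this way, everything hinges on the weights: if the propensity model is misspecified, the weighted means are off no matter how the final step is computed.&lt;/p&gt;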




&lt;h3 id=&#34;ML&#34;&gt;Let&amp;rsquo;s Try ML xgboost
  &lt;a href=&#34;#ML&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Let&amp;rsquo;s see if xgboost can tease out the outcome model without us specifying all these weird interactions and quadratic relationships.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidymodels)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(doParallel)  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(future)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;workers &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; parallel&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;detectCores&lt;/span&gt;(logical &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plan&lt;/span&gt;(multisession, workers &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; workers)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;future&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;nbrOfWorkers&lt;/span&gt;() 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Set up parallel processing&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;all_cores &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; parallel&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;detectCores&lt;/span&gt;(logical &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;cl &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;makePSOCKcluster&lt;/span&gt;(all_cores &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Leave 1 core free&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;registerDoParallel&lt;/span&gt;(cl)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_ml &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(Y,A,W1,W2,W3,W4) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(Y),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(A))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Define model specification&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_spec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;boost_tree&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  trees &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  tree_depth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  min_n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  loss_reduction &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# sample_size = tune(),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mtry &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  learn_rate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set_engine&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgboost&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set_mode&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;classification&amp;#34;&lt;/span&gt;)  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Create workflow&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_wf &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;workflow&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_model&lt;/span&gt;(xgb_spec) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_formula&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; .)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Tuning grid&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;grid_space_filling&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tree_depth&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;min_n&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;loss_reduction&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mtry&lt;/span&gt;(),df_ml),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;learn_rate&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  size &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;20&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Cross-validation and tuning&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;folds &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vfold_cv&lt;/span&gt;(df_ml, v &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_res &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune_grid&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  xgb_wf,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  resamples &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; folds,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; xgb_grid,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  control &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;control_grid&lt;/span&gt;(save_pred &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                         parallel_over &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;everything&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Select best model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;best_xgb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select_best&lt;/span&gt;(xgb_res, metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;roc_auc&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Finalize and fit&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_xgb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize_workflow&lt;/span&gt;(xgb_wf, best_xgb)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_fit &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;fit&lt;/span&gt;(final_xgb, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; df_ml)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# g-comp&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;g_comp&lt;/span&gt;(final_fit, df_ml, &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 0.03447109
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Wow, nice! ✅ Quite close to our true ATE without specifying any interactions or quadratic relationships. Mind you, this dataset is quite large.&lt;/p&gt;
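&lt;p&gt;For a fitted tidymodels workflow, the g-computation step boils down to predicting class probabilities on two counterfactual copies of the data. Here is a minimal sketch of that step, using a plain logistic regression workflow as a stand-in so it runs quickly. The helper name &lt;code&gt;g_comp_prob&lt;/code&gt;, the toy data, and the stand-in model are assumptions for illustration, and the &lt;code&gt;g_comp()&lt;/code&gt; used in this post may differ in detail.&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(tidymodels)

set.seed(1)
n &amp;lt;- 5000

# Hypothetical toy data; Y and A are factors, as in df_ml
toy &amp;lt;- tibble(
  W = rnorm(n),
  A = rbinom(n, 1, plogis(0.5 * W)),
  Y = rbinom(n, 1, plogis(-1 + 0.3 * A + 0.7 * W))
) |&amp;gt;
  mutate(Y = as.factor(Y), A = as.factor(A))

# Stand-in classifier (assumption): any classification workflow would do here
wf &amp;lt;- workflow() |&amp;gt;
  add_model(logistic_reg() |&amp;gt; set_engine(&amp;#34;glm&amp;#34;)) |&amp;gt;
  add_formula(Y ~ .)
fit_wf &amp;lt;- fit(wf, data = toy)

# Predict P(Y = 1) with everyone set to treated, then to untreated
g_comp_prob &amp;lt;- function(fit, data) {
  lv &amp;lt;- levels(data$A)
  d1 &amp;lt;- mutate(data, A = factor(&amp;#34;1&amp;#34;, levels = lv))
  d0 &amp;lt;- mutate(data, A = factor(&amp;#34;0&amp;#34;, levels = lv))
  p1 &amp;lt;- predict(fit, new_data = d1, type = &amp;#34;prob&amp;#34;)$.pred_1
  p0 &amp;lt;- predict(fit, new_data = d0, type = &amp;#34;prob&amp;#34;)$.pred_1
  mean(p1 - p0)
}

g_comp_prob(fit_wf, toy)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Keeping the factor levels of &lt;code&gt;A&lt;/code&gt; intact in both counterfactual copies matters, because the fitted workflow expects the same levels it saw during training.&lt;/p&gt;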
&lt;p&gt;Now, let&amp;rsquo;s see if we can use &lt;code&gt;xgboost&lt;/code&gt; to build an accurate treatment model and plug its weights into our good old trusty &lt;code&gt;glm&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Recreate workflow&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_wf &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;workflow&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_model&lt;/span&gt;(xgb_spec) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_formula&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; .)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Tuning grid&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;grid_space_filling&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tree_depth&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;min_n&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;loss_reduction&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mtry&lt;/span&gt;(),df_ml &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;Y)),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;learn_rate&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  size &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;20&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Cross-validation and tuning&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;folds &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vfold_cv&lt;/span&gt;(df_ml &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;Y), v &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_res &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune_grid&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  xgb_wf,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  resamples &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; folds,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; xgb_grid,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  control &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;control_grid&lt;/span&gt;(save_pred &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                         parallel_over &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;everything&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Select best model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;best_xgb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select_best&lt;/span&gt;(xgb_res, metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;roc_auc&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Finalize and fit&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_xgb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize_workflow&lt;/span&gt;(xgb_wf, best_xgb)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_fit &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;fit&lt;/span&gt;(final_xgb, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; df_ml &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;Y))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# calculate propensity scores (ps)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; df_ml &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;Y), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ps
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;weights &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;ps_final))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# glm model &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; A, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;, weights &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; weights)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;g_comp&lt;/span&gt;(model, df_ml)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 0.04840169
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Wow, compared to 
&lt;a href=&#34;#wrong_ps_model&#34;&gt;this&lt;/a&gt;, our ATE is much closer to the true ATE than the estimate from the wrongly specified treatment model. It&amp;rsquo;s still quite biased, though, isn&amp;rsquo;t it? But at least we know ML methods can probably handle these complex relationships.&lt;/p&gt;
&lt;p&gt;Now let&amp;rsquo;s write functions to estimate bias and variance, since that&amp;rsquo;s our major question. Then we&amp;rsquo;ll look into the TMLE procedure.&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;bias &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(predicted_theta, theta) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(predicted_theta &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; theta))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;variance &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(predicted_theta) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;var&lt;/span&gt;(predicted_theta))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
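&lt;p&gt;As a quick sanity check, here is a minimal sketch of how these two helpers behave on made-up numbers. The target &lt;code&gt;theta_true&lt;/code&gt; and the simulated estimates &lt;code&gt;theta_hat&lt;/code&gt; are hypothetical and not part of the analysis above; the helpers are redefined only so the snippet runs on its own.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Same helpers as above, repeated so this snippet is self-contained
bias &amp;lt;- function(predicted_theta, theta) mean(predicted_theta - theta)
variance &amp;lt;- function(predicted_theta) var(predicted_theta)

set.seed(1)
theta_true &amp;lt;- 0.10                                  # hypothetical true value
theta_hat  &amp;lt;- rnorm(1000, mean = 0.12, sd = 0.02)   # hypothetical estimates

b &amp;lt;- bias(theta_hat, theta_true)   # close to 0.02 by construction
v &amp;lt;- variance(theta_hat)           # close to 0.02^2 = 4e-04
mse &amp;lt;- mean((theta_hat - theta_true)^2)

# MSE is roughly bias^2 + variance (var() divides by n - 1, hence roughly)
round(c(bias = b, variance = v, mse = mse, bias2_plus_var = b^2 + v), 5)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The last line is a reminder of why we track both quantities: an estimator can look stable (low variance) and still be consistently off target (high bias).&lt;/p&gt;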




&lt;h3 id=&#34;tmle_procedure&#34;&gt;TMLE
  &lt;a href=&#34;#tmle_procedure&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Since we know xgboost is able to estimate the correct outcome model, why don&amp;rsquo;t we just use logistic regression here? Let&amp;rsquo;s imagine we somehow got only the treatment model correct, but not the outcome model. Will TMLE be able to tease this out?&lt;/p&gt;




&lt;h4 id=&#34;step-1-create-outcome-model&#34;&gt;Step 1. Create Outcome Model
  &lt;a href=&#34;#step-1-create-outcome-model&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W4, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# wrong on purpose: W2 enters linearly and W1:W2 is omitted&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model_outcome_all &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model_outcome, newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; df, type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;response&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model_outcome, newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;response&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model_outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model_outcome, newdata &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;response&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4 id=&#34;step-2-create-treatment-model--clever-covariate&#34;&gt;Step 2. Create Treatment Model &amp;amp; Clever Covariate
  &lt;a href=&#34;#step-2-create-treatment-model--clever-covariate&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model_treatment &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;I&lt;/span&gt;(W2^2) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W4, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#-0.5 + 0.8*W1 + 0.5*W2^2 + 0.3*W3 - 0.4*W1*W2 + 0.2*W4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; model_treatment&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;fitted.values
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ps
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# ps_final &amp;lt;- pmax(pmin(ps, 0.95), 0.05)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;a_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model_treatment, df, type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;response&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;a_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model_treatment, df, type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;response&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;clever_covariate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final, &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;ps_final))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
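&lt;p&gt;The commented-out line above hints at truncating the propensity scores. Here is a small sketch of why that can matter, using hypothetical propensity scores and treatment indicators (&lt;code&gt;ps_toy&lt;/code&gt; and &lt;code&gt;a_toy&lt;/code&gt; are made up for illustration): scores near 0 or 1 blow up the clever covariate, and bounding them at 0.05 and 0.95 caps its magnitude at 20.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical propensity scores, including two extreme ones
ps_toy &amp;lt;- c(0.01, 0.20, 0.50, 0.80, 0.99)
a_toy  &amp;lt;- c(1, 0, 1, 0, 0)   # hypothetical treatment indicators

# Clever covariate without truncation: 1/ps if treated, -1/(1 - ps) if not
h_raw &amp;lt;- ifelse(a_toy == 1, 1 / ps_toy, -1 / (1 - ps_toy))

# Same calculation after bounding the scores to [0.05, 0.95]
ps_trunc &amp;lt;- pmax(pmin(ps_toy, 0.95), 0.05)
h_trunc  &amp;lt;- ifelse(a_toy == 1, 1 / ps_trunc, -1 / (1 - ps_trunc))

data.frame(ps_toy, a_toy, h_raw = round(h_raw, 2), h_trunc = round(h_trunc, 2))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Truncation trades a little bias for less variance, so whether to switch it on is a judgment call; the analysis here leaves it off.&lt;/p&gt;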



&lt;h4 id=&#34;step-3-estimate-fluctuation-parameter&#34;&gt;Step 3. Estimate Fluctuation Parameter
  &lt;a href=&#34;#step-3-estimate-fluctuation-parameter&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;epsilon_model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;offset&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(model_outcome&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;fitted.values)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; clever_covariate, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summary&lt;/span&gt;(epsilon_model)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## 
## Call:
## glm(formula = Y ~ -1 + offset(qlogis(model_outcome$fitted.values)) + 
##     clever_covariate, family = &amp;#34;binomial&amp;#34;)
## 
## Coefficients:
##                  Estimate Std. Error z value Pr(&amp;gt;|z|)    
## clever_covariate 0.041886   0.009675   4.329  1.5e-05 ***
## ---
## Signif. codes:  0 &amp;#39;***&amp;#39; 0.001 &amp;#39;**&amp;#39; 0.01 &amp;#39;*&amp;#39; 0.05 &amp;#39;.&amp;#39; 0.1 &amp;#39; &amp;#39; 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 11519  on 10000  degrees of freedom
## Residual deviance: 11497  on  9999  degrees of freedom
## AIC: 11499
## 
## Number of Fisher Scoring iterations: 4
&lt;/code&gt;&lt;/pre&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;epsilon &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; epsilon_model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;coefficients
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4 id=&#34;step-4-update-initial-outcomes&#34;&gt;Step 4. Update Initial Outcomes
  &lt;a href=&#34;#step-4-update-initial-outcomes&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;updated_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(model_outcome_1)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;epsilon&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;a_1)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;updated_outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(model_outcome_0)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;epsilon&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;a_0)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4 id=&#34;step-5-compute-ate&#34;&gt;Step 5. Compute ATE
  &lt;a href=&#34;#step-5-compute-ate&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;(ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(updated_outcome_1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;updated_outcome_0))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] 0.04068321
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Wow, even with the wrong outcome model, the right treatment model brings our ATE closer to the true ATE! Our wrongly specified outcome model alone would have produced an ATE of -0.009 
&lt;a href=&#34;#wrong_outcome&#34;&gt;here&lt;/a&gt;. Not too shabby!&lt;/p&gt;




&lt;h4 id=&#34;step-6-estimate-standard-error&#34;&gt;Step 6. Estimate Standard Error
  &lt;a href=&#34;#step-6-estimate-standard-error&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;updated_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, updated_outcome_1, updated_outcome_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sqrt&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;var&lt;/span&gt;((Y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;updated_outcome)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;clever_covariate&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;updated_outcome_1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;updated_outcome_0&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;ate)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;pval &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pnorm&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;abs&lt;/span&gt;(ate&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;se)))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ATE: &amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(ate,&lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;), &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; [95%CI &amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(ate&lt;span style=&#34;color:#099&#34;&gt;-1.96&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;se,&lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;),&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;-&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(ate&lt;span style=&#34;color:#099&#34;&gt;+1.96&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;se,&lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;),&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;, p=&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(pval,&lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;),&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;]&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;## [1] &amp;#34;ATE: 0.041 [95%CI 0.021-0.06, p=0]&amp;#34;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;There you have it! Our final estimate, standard error, and p-value. Thanks to 
&lt;a href=&#34;https://www.khstats.com/blog/tmle/tutorial-pt2#step-6-calculate-the-standard-errors-for-confidence-intervals-and-p-values&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;khstats&lt;/a&gt; for the step-by-step guidance, which was very helpful for reproducing the framework.&lt;/p&gt;
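&lt;p&gt;The six steps above can be collected into one function. Below is a rough sketch, assuming a data frame with a binary outcome &lt;code&gt;Y&lt;/code&gt; and a binary treatment &lt;code&gt;A&lt;/code&gt;. The function name &lt;code&gt;tmle_ate&lt;/code&gt;, the toy data, and its coefficients are all hypothetical, and this is meant for learning rather than as a replacement for a dedicated TMLE package.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper wrapping Steps 1 to 6 for a binary Y and binary A
tmle_ate &amp;lt;- function(data, outcome_formula, treatment_formula) {
  y &amp;lt;- data$Y
  # Step 1: initial outcome model and counterfactual predictions
  q_fit &amp;lt;- glm(outcome_formula, data = data, family = &amp;#34;binomial&amp;#34;)
  q_a &amp;lt;- predict(q_fit, newdata = data, type = &amp;#34;response&amp;#34;)
  q_1 &amp;lt;- predict(q_fit, newdata = transform(data, A = 1), type = &amp;#34;response&amp;#34;)
  q_0 &amp;lt;- predict(q_fit, newdata = transform(data, A = 0), type = &amp;#34;response&amp;#34;)
  # Step 2: treatment model and clever covariate
  g_fit &amp;lt;- glm(treatment_formula, data = data, family = &amp;#34;binomial&amp;#34;)
  ps  &amp;lt;- predict(g_fit, newdata = data, type = &amp;#34;response&amp;#34;)
  h_1 &amp;lt;- 1 / ps
  h_0 &amp;lt;- -1 / (1 - ps)
  h_a &amp;lt;- ifelse(data$A == 1, h_1, h_0)
  # Step 3: fluctuation parameter, with the initial fit as an offset
  eps_fit &amp;lt;- glm(y ~ -1 + offset(qlogis(q_a)) + h_a, family = &amp;#34;binomial&amp;#34;)
  eps &amp;lt;- unname(coef(eps_fit))
  # Step 4: update the initial predictions on the logit scale
  q_1_star &amp;lt;- plogis(qlogis(q_1) + eps * h_1)
  q_0_star &amp;lt;- plogis(qlogis(q_0) + eps * h_0)
  q_a_star &amp;lt;- ifelse(data$A == 1, q_1_star, q_0_star)
  # Step 5: average treatment effect
  ate &amp;lt;- mean(q_1_star - q_0_star)
  # Step 6: standard error from the influence curve
  ic &amp;lt;- (y - q_a_star) * h_a + q_1_star - q_0_star - ate
  se &amp;lt;- sqrt(var(ic) / nrow(data))
  c(ate = ate, se = se, lower = ate - 1.96 * se, upper = ate + 1.96 * se)
}

# Toy data with made-up coefficients (not the data-generating process used in this post)
set.seed(2026)
n_toy &amp;lt;- 5000
W1 &amp;lt;- rnorm(n_toy)
W2 &amp;lt;- rnorm(n_toy)
A  &amp;lt;- rbinom(n_toy, 1, plogis(-0.2 + 0.5 * W1 + 0.3 * W2^2))
Y  &amp;lt;- rbinom(n_toy, 1, plogis(-1 + 0.4 * A + 0.6 * W1 + 0.5 * W2^2))
toy &amp;lt;- data.frame(Y, A, W1, W2)

# Outcome model misspecified on purpose (W2 linear), treatment model correct
res &amp;lt;- tmle_ate(toy, Y ~ A + W1 + W2, A ~ W1 + I(W2^2))
round(res, 3)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Passing the two formulas as arguments makes it easy to swap in a correct or incorrect specification for either model and watch how the estimate moves.&lt;/p&gt;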




&lt;h2 id=&#34;models&#34;&gt;Comparing Models
  &lt;a href=&#34;#models&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Now, let&amp;rsquo;s resample 1000 times with n = 6000 (from a sample size calculation for our effect size with 80% power and 5% alpha) and compare: 1) a correctly specified logistic regression outcome model, 2) an inverse probability weighting (IPW) approach with correctly specified treatment assignment probabilities, 3) an incorrectly specified logistic regression outcome model, and 4) a hyperparameter-tuned xgboost outcome model. Then we&amp;rsquo;ll see how their biases and variances differ.&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;My messy code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n_sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;character&lt;/span&gt;(),bias&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;numeric&lt;/span&gt;(),variance&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;numeric&lt;/span&gt;())
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;### correct outcome model specification, logistic regression&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_sample)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;n_sample) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(df, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6000&lt;/span&gt;, replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;I&lt;/span&gt;(W2^2) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W4, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; data, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;g_comp&lt;/span&gt;(model,data)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  predicted_ate[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ate
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bind_rows&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_correct_outcome&amp;#34;&lt;/span&gt;,bias&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bias&lt;/span&gt;(predicted_ate,true_ATE),variance&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;variance&lt;/span&gt;(predicted_ate)))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;### correct treatment model specification, logistic regression ipw&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_sample)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;n_sample) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(df, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6000&lt;/span&gt;, replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ps_model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;I&lt;/span&gt;(W2^2) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W4, data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;data, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#-0.5 + 0.8*W1 + 0.5*W2^2 + 0.3*W3 - 0.4*W1*W2 + 0.2*W4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ps &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ps_model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;fitted.values
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ps_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmax&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmin&lt;/span&gt;(ps, &lt;span style=&#34;color:#099&#34;&gt;0.95&lt;/span&gt;), &lt;span style=&#34;color:#099&#34;&gt;0.05&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  weights &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;ps_final))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; A, data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;data, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;, weights &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; weights)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;g_comp&lt;/span&gt;(model,data)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  predicted_ate[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ate
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bind_rows&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_treatment_outcome&amp;#34;&lt;/span&gt;,bias&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bias&lt;/span&gt;(predicted_ate,true_ATE),variance&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;variance&lt;/span&gt;(predicted_ate)))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;### no interaction or quadratic relationship outcome model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_sample)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;n_sample) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(df, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6000&lt;/span&gt;, replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W4, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; data, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;g_comp&lt;/span&gt;(model,data)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  predicted_ate[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ate
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bind_rows&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_wrong_outcome&amp;#34;&lt;/span&gt;,bias&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bias&lt;/span&gt;(predicted_ate,true_ATE),variance&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;variance&lt;/span&gt;(predicted_ate)))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;### xgboost&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidymodels)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(future)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plan&lt;/span&gt;(multisession, workers &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_sample)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;n_sample) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(df, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6000&lt;/span&gt;, replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(Y,A,W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W4) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(Y), A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(A))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; data &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# you really don&amp;#39;t need to do this, but I was too lazy to change the other ML code&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_spec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;boost_tree&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  trees &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  tree_depth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  min_n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  loss_reduction &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mtry &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  learn_rate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set_engine&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgboost&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set_mode&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;classification&amp;#34;&lt;/span&gt;)  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Create workflow&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_wf &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;workflow&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_model&lt;/span&gt;(xgb_spec) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_formula&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; .)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Tuning grid&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;grid_space_filling&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;trees&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tree_depth&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;min_n&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;loss_reduction&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mtry&lt;/span&gt;(),train),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;learn_rate&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  size &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Cross-validation and tuning&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;folds &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vfold_cv&lt;/span&gt;(train, v &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_res &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune_grid&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  xgb_wf,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  resamples &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; folds,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; xgb_grid,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  control &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;control_grid&lt;/span&gt;(save_pred &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                         verbose &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Select best model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;best_xgb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select_best&lt;/span&gt;(xgb_res, metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;roc_auc&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Finalize and fit&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_xgb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize_workflow&lt;/span&gt;(xgb_wf, best_xgb)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_fit &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;fit&lt;/span&gt;(final_xgb, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# g-comp&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;g_comp&lt;/span&gt;(final_fit, train, &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ate
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bind_rows&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgboost_outcome_model&amp;#34;&lt;/span&gt;,bias&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bias&lt;/span&gt;(predicted_ate,true_ATE),variance&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;variance&lt;/span&gt;(predicted_ate)))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; model &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; bias &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; variance &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; logreg_correct_outcome &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0009778 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001410 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; logreg_treatment_outcome &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0019444 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001581 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; logreg_wrong_outcome &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0464683 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001354 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgboost_outcome_model &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0164066 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0000459 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Wow, look at that! When either the outcome model or the treatment model was correctly specified, logistic regression still performed best, with the lowest bias and low variance. When the outcome model was misspecified, the bias grew (about -0.046) while the variance barely changed. When we used xgboost for the outcome model only, it was less biased (about -0.016) than the misspecified logistic regression outcome model and, interestingly, had the lowest variance of the four. Very interesting! I think this is helpful because it suggests a tree-based model can tease out quadratic and interaction relationships without us having to specify them. Now, what if we use xgboost for both the outcome and treatment models and then apply the TMLE framework to see whether that does any better?&lt;/p&gt;




&lt;h4 id=&#34;using-tmle-procedures-with-xgboost&#34;&gt;Using TMLE procedures with Xgboost
  &lt;a href=&#34;#using-tmle-procedures-with-xgboost&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;p&gt;We&amp;rsquo;ll do the same as above: resample 1000 times, drawing n = 6000 with replacement each time, then assess the bias and variance of the ATE. We will use xgboost for both the treatment and outcome models, then apply the TMLE procedure shown above. For tuning, we will use a space-filling grid search (&lt;code&gt;grid_space_filling&lt;/code&gt; with a size of 5).&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;My messy code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidymodels)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidyverse)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(future)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plan&lt;/span&gt;(multisession, workers &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_sample)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;n_sample) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# sample&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(df, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6000&lt;/span&gt;, replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(Y,A,W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W4) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(Y), A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(A))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; data
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# outcome model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_spec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;boost_tree&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  trees &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  tree_depth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  min_n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  loss_reduction &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# sample_size = tune(),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mtry &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  learn_rate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set_engine&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgboost&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set_mode&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;classification&amp;#34;&lt;/span&gt;)  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Create workflow&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_wf &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;workflow&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_model&lt;/span&gt;(xgb_spec) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_formula&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; .)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Tuning grid&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;grid_space_filling&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;trees&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tree_depth&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;min_n&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;loss_reduction&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mtry&lt;/span&gt;(),train),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;learn_rate&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  size &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Cross-validation and tuning&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;folds &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vfold_cv&lt;/span&gt;(train, v &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_res &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune_grid&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  xgb_wf,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  resamples &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; folds,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; xgb_grid,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  control &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;control_grid&lt;/span&gt;(save_pred &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                         &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# parallel_over = &amp;#34;resamples&amp;#34;, &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                         verbose &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Select best model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;best_xgb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select_best&lt;/span&gt;(xgb_res, metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;roc_auc&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Finalize and fit&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_xgb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize_workflow&lt;/span&gt;(xgb_wf, best_xgb)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_fit &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;fit&lt;/span&gt;(final_xgb, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# predict&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train, type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# -------------------------------------------------------&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# treatment model &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;train_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;Y)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_wf_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;workflow&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_model&lt;/span&gt;(xgb_spec) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_formula&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; .)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_grid_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;grid_space_filling&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;trees&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tree_depth&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;min_n&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;loss_reduction&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mtry&lt;/span&gt;(),train_tx),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;learn_rate&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  size &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Cross-validation and tuning&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;folds_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vfold_cv&lt;/span&gt;(train_tx, v &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_res_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune_grid&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  xgb_wf_tx,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  resamples &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; folds_tx,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; xgb_grid_tx,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  control &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;control_grid&lt;/span&gt;(save_pred &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                         verbose &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Select best model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;best_xgb_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select_best&lt;/span&gt;(xgb_res_tx, metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;roc_auc&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Finalize and fit&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_xgb_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize_workflow&lt;/span&gt;(xgb_wf_tx, best_xgb_tx)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_fit_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;fit&lt;/span&gt;(final_xgb_tx, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train_tx)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit_tx, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;A), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ps
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;a_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;a_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;ps_final)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;clever_covariate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(train_tx&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final, &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;ps_final))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#### Step 3. Estimate the fluctuation parameter (epsilon)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;epsilon_model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(train&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;offset&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; clever_covariate, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summary&lt;/span&gt;(epsilon_model)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;epsilon &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; epsilon_model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;coefficients
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#### Step 4. Update Initial Outcomes&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;updated_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome_1)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;epsilon&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;a_1)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;updated_outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome_0)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;epsilon&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;a_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#### Step 5. Compute ATE&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(updated_outcome_1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;updated_outcome_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ate
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bind_rows&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgboost_tmle_model_size5&amp;#34;&lt;/span&gt;,bias&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bias&lt;/span&gt;(predicted_ate,true_ATE),variance&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;variance&lt;/span&gt;(predicted_ate)))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; model &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; bias &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; variance &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; logreg_correct_outcome &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0009778 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001410 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; logreg_treatment_outcome &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0019444 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001581 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; logreg_wrong_outcome &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0464683 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001354 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgboost_outcome_model &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0164066 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0000459 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgboost_tmle_model_size5 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0066270 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001399 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Wow, how about that! The TMLE estimate is less biased than the xgboost outcome model alone (0.0066 vs -0.0164 in the table above). The variance is higher than the pure xgboost outcome model (0.00014 vs 0.00005), but it&amp;rsquo;s about the same as logistic regression. It does seem like TMLE can reduce bias. Do you think we can do better? What if we increase the tuning grid size to 20?&lt;/p&gt;
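&lt;p&gt;Before scaling up the grid, it may help to see the targeting step (steps 3 to 5 in the loop) on its own. Below is a minimal sketch in base R on made-up simulated data, not the data or models from this post: the single confounder &lt;code&gt;W&lt;/code&gt;, the data-generating coefficients, and every variable name are assumptions for illustration. The outcome model deliberately leaves out the confounder, so the clever covariate from a correctly specified propensity model has something to fix.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical simulated data: W confounds both treatment A and outcome Y
set.seed(1)
n &amp;lt;- 5000
W &amp;lt;- rnorm(n)
A &amp;lt;- rbinom(n, 1, plogis(0.5 * W))
Y &amp;lt;- rbinom(n, 1, plogis(-0.5 + A + W))

# Sample-average risk difference implied by the simulation above
true_ate &amp;lt;- mean(plogis(-0.5 + 1 + W) - plogis(-0.5 + W))

# Step 1. Initial outcome model, misspecified on purpose (W is left out)
q_fit &amp;lt;- glm(Y ~ A, family = binomial)
q_a &amp;lt;- predict(q_fit, type = &amp;#34;response&amp;#34;)
q_1 &amp;lt;- predict(q_fit, newdata = data.frame(A = rep(1, n)), type = &amp;#34;response&amp;#34;)
q_0 &amp;lt;- predict(q_fit, newdata = data.frame(A = rep(0, n)), type = &amp;#34;response&amp;#34;)

# Step 2. Propensity score model and clever covariate
g_fit &amp;lt;- glm(A ~ W, family = binomial)
ps &amp;lt;- predict(g_fit, type = &amp;#34;response&amp;#34;)
h_1 &amp;lt;- 1 / ps
h_0 &amp;lt;- -1 / (1 - ps)
h_a &amp;lt;- ifelse(A == 1, h_1, h_0)

# Step 3. Fluctuation parameter: no intercept, initial prediction as offset
eps_fit &amp;lt;- glm(Y ~ -1 + offset(qlogis(q_a)) + h_a, family = binomial)
eps &amp;lt;- unname(coef(eps_fit))

# Step 4. Update the counterfactual predictions on the logit scale
q_1_star &amp;lt;- plogis(qlogis(q_1) + eps * h_1)
q_0_star &amp;lt;- plogis(qlogis(q_0) + eps * h_0)

# Step 5. Compare the plug-in ATE before and after targeting
ate_naive &amp;lt;- mean(q_1 - q_0)
ate_tmle &amp;lt;- mean(q_1_star - q_0_star)
c(true = true_ate, naive = ate_naive, tmle = ate_tmle)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With the confounder missing from the outcome model, I would expect the naive plug-in estimate to sit further from &lt;code&gt;true_ate&lt;/code&gt; than the targeted one, but that is something to check by running it rather than take on faith.&lt;/p&gt;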
&lt;details&gt;
&lt;summary&gt;My messy code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_sample)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;n_sample) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(df, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6000&lt;/span&gt;, replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(Y,A,W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W4) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(Y), A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(A))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; data
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# train &amp;lt;- df |&amp;gt; select(Y,A,W1:W4) |&amp;gt; mutate(Y = as.factor(Y), A = as.factor(A))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# outcome model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_spec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;boost_tree&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  trees &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  tree_depth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  min_n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  loss_reduction &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# sample_size = tune(),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mtry &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  learn_rate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set_engine&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgboost&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set_mode&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;classification&amp;#34;&lt;/span&gt;)  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Create workflow&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_wf &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;workflow&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_model&lt;/span&gt;(xgb_spec) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_formula&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; .)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Tuning grid&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;grid_space_filling&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;trees&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tree_depth&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;min_n&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;loss_reduction&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mtry&lt;/span&gt;(),train),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;learn_rate&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  size &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;20&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Cross-validation and tuning&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;folds &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vfold_cv&lt;/span&gt;(train, v &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_res &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune_grid&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  xgb_wf,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  resamples &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; folds,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; xgb_grid,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  control &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;control_grid&lt;/span&gt;(save_pred &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                         &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# parallel_over = &amp;#34;resamples&amp;#34;, &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                         verbose &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Select best model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;best_xgb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select_best&lt;/span&gt;(xgb_res, metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;roc_auc&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Finalize and fit&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_xgb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize_workflow&lt;/span&gt;(xgb_wf, best_xgb)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_fit &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;fit&lt;/span&gt;(final_xgb, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# predict&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train, type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# -------------------------------------------------------&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# treatment model &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;train_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;Y)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_wf_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;workflow&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_model&lt;/span&gt;(xgb_spec) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_formula&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; .)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_grid_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;grid_space_filling&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;trees&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tree_depth&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;min_n&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;loss_reduction&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mtry&lt;/span&gt;(),train_tx),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;learn_rate&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  size &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;20&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Cross-validation and tuning&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;folds_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vfold_cv&lt;/span&gt;(train_tx, v &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_res_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune_grid&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  xgb_wf_tx,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  resamples &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; folds_tx,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; xgb_grid_tx,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  control &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;control_grid&lt;/span&gt;(save_pred &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                         verbose &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Select best model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;best_xgb_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select_best&lt;/span&gt;(xgb_res_tx, metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;roc_auc&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Finalize and fit&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_xgb_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize_workflow&lt;/span&gt;(xgb_wf_tx, best_xgb_tx)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_fit_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;fit&lt;/span&gt;(final_xgb_tx, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train_tx)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit_tx, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;A), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ps
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# ps_final &amp;lt;- pmax(pmin(ps, 0.95), 0.05)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;a_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;a_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; ps_final)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;clever_covariate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(train_tx&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final, &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;ps_final))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# step 3 &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;epsilon_model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(train&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;offset&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; clever_covariate, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summary&lt;/span&gt;(epsilon_model)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;epsilon &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; epsilon_model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;coefficients
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#### Step 4. Update Initial Outcomes&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;updated_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome_1)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;epsilon&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;a_1)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;updated_outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome_0)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;epsilon&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;a_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#### Step 5. Compute ATE&lt;/span&gt;
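&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(updated_outcome_1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;updated_outcome_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ate  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# store the estimate for resample i&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}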
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; model &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; bias &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; variance &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; logreg_correct_outcome &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0009778 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001410 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; logreg_treatment_outcome &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0019444 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001581 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; logreg_wrong_outcome &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0464683 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001354 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgboost_outcome_model &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.0164066 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0000459 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgboost_tmle_model_size5 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0066270 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001399 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; xgboost_tmle_model_size20 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0041615 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0001645 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Look at that! With a grid size of 20, TMLE is less biased than with a grid size of 5 (0.0042 vs 0.0066), but you can see the variance starting to increase (0.00016 vs 0.00014). This is really cool!&lt;/p&gt;
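&lt;p&gt;As a rough sketch of how the bias and variance columns above could be computed from a vector of resampled estimates: here I assume bias is the mean estimate minus the true ATE, and variance is the sample variance of the estimates. The values of &lt;code&gt;true_ate&lt;/code&gt; and &lt;code&gt;estimates&lt;/code&gt; are made-up placeholders, not the numbers from the simulation in this post.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# hypothetical inputs, for illustration only
set.seed(1)
true_ate &amp;lt;- 0.20                                     # assumed true effect
estimates &amp;lt;- rnorm(1000, mean = 0.21, sd = 0.012)    # stand-in for predicted_ate

bias &amp;lt;- mean(estimates) - true_ate    # average error of the estimator
variance &amp;lt;- var(estimates)            # spread of the estimates across resamples
rmse &amp;lt;- sqrt(bias^2 + variance)       # approximate root mean squared error

data.frame(bias = bias, variance = variance, rmse = rmse)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Reporting an RMSE-style summary alongside the two columns is just one possible way to weigh a small gain in bias against a small loss in variance.&lt;/p&gt;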




&lt;h2 id=&#34;traditional&#34;&gt;Is there a traditional statistical method for this?
  &lt;a href=&#34;#traditional&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Yes, apparently there is! The augmented inverse probability weighting (AIPW) estimator, described 
&lt;a href=&#34;https://pmc.ncbi.nlm.nih.gov/articles/PMC8793316/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;here&lt;/a&gt;, is a doubly robust method for estimating causal effects that combines propensity score weighting with outcome modeling. AIPW remains consistent as long as either the propensity score model or the outcome model is correctly specified (sound familiar? It&amp;rsquo;s like TMLE!), making it more reliable than relying on either model alone when model specification is uncertain. 
&lt;a href=&#34;https://cran.r-project.org/web/packages/AIPW/index.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Here is the AIPW&lt;/a&gt; R package, in case we need it in the future.&lt;/p&gt;
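&lt;p&gt;To make the doubly robust idea a bit more concrete, here is a minimal hand-rolled AIPW sketch in base R. It does not use the AIPW package, and the simulated data, the coefficients, and the names &lt;code&gt;m1&lt;/code&gt;, &lt;code&gt;m0&lt;/code&gt;, &lt;code&gt;ps&lt;/code&gt; and &lt;code&gt;phi&lt;/code&gt; are my own assumptions for illustration. The standard error uses the usual influence-function approximation.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# simulated toy data (assumed, not the data from this post)
set.seed(123)
n &amp;lt;- 5000
W &amp;lt;- rnorm(n)
A &amp;lt;- rbinom(n, 1, plogis(0.5 * W))            # treatment depends on W
Y &amp;lt;- rbinom(n, 1, plogis(-1 + A + 0.8 * W))   # outcome depends on A and W
d &amp;lt;- data.frame(Y, A, W)

# outcome model and propensity score model
out_fit &amp;lt;- glm(Y ~ A + W, family = binomial, data = d)
ps_fit &amp;lt;- glm(A ~ W, family = binomial, data = d)

# predicted outcome under A = 1 and A = 0, and the propensity score
m1 &amp;lt;- predict(out_fit, newdata = transform(d, A = 1), type = &amp;#34;response&amp;#34;)
m0 &amp;lt;- predict(out_fit, newdata = transform(d, A = 0), type = &amp;#34;response&amp;#34;)
ps &amp;lt;- predict(ps_fit, type = &amp;#34;response&amp;#34;)

# AIPW: outcome-model prediction plus an inverse-probability-weighted residual
phi &amp;lt;- (m1 + d$A * (d$Y - m1) / ps) -
  (m0 + (1 - d$A) * (d$Y - m0) / (1 - ps))

ate_aipw &amp;lt;- mean(phi)
se_aipw &amp;lt;- sd(phi) / sqrt(n)   # influence-function based standard error

c(estimate = ate_aipw,
  lower = ate_aipw - qnorm(0.975) * se_aipw,
  upper = ate_aipw + qnorm(0.975) * se_aipw)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If the outcome model is right, the weighted residual term averages out to roughly zero. If only the propensity model is right, that same term corrects the error in the outcome predictions. That is the intuition behind being correct when either model is correct.&lt;/p&gt;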
&lt;p&gt;This doubly robust method also reminds me of Double Machine Learning (DML). We should compare all these methods in the future!&lt;/p&gt;




&lt;h2 id=&#34;tails&#34;&gt;But Does It Really Work?
  &lt;a href=&#34;#tails&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;After the blog was published on 11/16/25, Frank Harrell stated &amp;ldquo;No examination of TMLE is complete without critical examination of the non-coverage probabilities in both tails of confidence intervals. Prepare to be disappointed. Be sure to estimate non-coverage separately in the two tails. Some confidence interval procedures are accurate with overall coverage despite being wrong in both tails.&amp;rdquo; It&amp;rsquo;s true that we did not assess the 95% confidence intervals in our resampling. He is a very experienced statistician, so he is most likely right. Now, let&amp;rsquo;s examine the non-coverage in the tails! Let&amp;rsquo;s compare 1) logistic regression (correctly specified outcome model), 2) logistic regression with IPW (correctly specified treatment model), 3) logistic regression (misspecified outcome model), 4) TMLE with xgboost (grid search size of 5), and 5) TMLE with xgboost (grid search size of 20). Then we&amp;rsquo;ll compute their 95% CIs, calculate the proportion of non-coverage in the left and right tails for each of the five approaches, and visualize the results!&lt;/p&gt;
&lt;p&gt;For the logistic regression models, we use the bootstrap with 1000 resamples to estimate the standard error. For the TMLE models, we use the standard error estimated by the TMLE procedure itself.&lt;/p&gt;
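&lt;p&gt;Before the full simulation, here is a small sketch of how non-coverage can be split by tail once each replicate has a lower and an upper confidence limit. The helper &lt;code&gt;tail_noncoverage&lt;/code&gt; is hypothetical, and the left/right labels are my assumed convention: &amp;ldquo;left&amp;rdquo; means the true value falls below the lower limit, and &amp;ldquo;right&amp;rdquo; means it falls above the upper limit. The toy data are simulated Wald intervals, not results from this post.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper: split confidence interval non-coverage into its two tails
tail_noncoverage &amp;lt;- function(lower, upper, truth) {
  c(
    left  = mean(truth &amp;lt; lower),  # truth lies to the left of the interval
    right = mean(truth &amp;gt; upper),  # truth lies to the right of the interval
    total = mean(truth &amp;lt; lower | truth &amp;gt; upper)
  )
}

# Toy check: 2000 simulated Wald intervals for a normal estimate with known SE
set.seed(2025)
truth &amp;lt;- 0.04
se &amp;lt;- 0.01
est &amp;lt;- rnorm(2000, mean = truth, sd = se)
tail_noncoverage(est - 1.96 * se, est + 1.96 * se, truth)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For a well-calibrated 95% interval, each tail should sit near 2.5%. An interval can still hit about 95% overall while the two tails are badly unbalanced, which is exactly the failure that a single coverage number hides.&lt;/p&gt;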
&lt;details&gt;
&lt;summary&gt;my messy code for logistic regression&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(dplyr)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(future)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(future.apply)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W4 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# TRUE propensity score model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-0.5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.8&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# TRUE outcome model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Calculate TRUE ATE&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;logit_Y1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;logit_Y0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y1_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(logit_Y1)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y0_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(logit_Y0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(Y1_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; Y0_true)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W1, W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W2, W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W3, W4 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W4, A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; A, Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; Y)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;g_comp &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(model,data,ml&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(ml&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    y1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model, new_data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    y0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model, new_data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  } else {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    y1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model, newdata&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;response&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    y0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(model, newdata&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;response&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(y1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;y0))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Set up parallel backend&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plan&lt;/span&gt;(multisession, workers &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt;)  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# or plan(multisession)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n_sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; predicted_lower &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; predicted_upper &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_sample)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n_boot &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;i_boot_seed &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;runif&lt;/span&gt;(n_boot, &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;, n_boot &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;)  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Better seed range&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n_boot_sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(method&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;character&lt;/span&gt;(),ate&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;numeric&lt;/span&gt;(),lower&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;numeric&lt;/span&gt;(),upper&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;numeric&lt;/span&gt;())
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;formula_list &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;formula_list[[1]] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.formula&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;I&lt;/span&gt;(W2^2) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;I&lt;/span&gt;(W4^2))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;formula_list[[2]] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.formula&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W4)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;formula_list[[3]] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.formula&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;I&lt;/span&gt;(W2^2) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; W4)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;model_function &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(formula,data,method) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(method&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_outcome_correct&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    model_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(formula &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; formula, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; data, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(method&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_treatment_correct&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ps_model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(formula&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;formula,data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;data,family&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ps &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ps_model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;fitted.values
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ps_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmax&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pmin&lt;/span&gt;(ps, &lt;span style=&#34;color:#099&#34;&gt;0.95&lt;/span&gt;), &lt;span style=&#34;color:#099&#34;&gt;0.05&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    weights &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;ps_final))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    model_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(Y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt;A,data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;data,family&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;,weights&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;weights)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(method&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_outcome_wrong&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    model_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(formula &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; formula, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; data, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;is.null&lt;/span&gt;(method)) { &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;stop&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;need method&amp;#34;&lt;/span&gt;) }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;g_comp&lt;/span&gt;(model_final,data)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(ate)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Create the bootstrap function&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;bootstrap_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(j, data, formula, n_boot_sample, i_boot_seed,method) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(i_boot_seed[j])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  data_boot &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(data, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_boot_sample, replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ate_boot &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;model_function&lt;/span&gt;(formula,data_boot,method&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;method)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(ate_boot)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# method vector &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;method_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_outcome_correct&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_treatment_correct&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_outcome_wrong&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(method in method_vec) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(method&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_outcome_correct&amp;#34;&lt;/span&gt;) { formula &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; formula_list[[1]] }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(method&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_treatment_correct&amp;#34;&lt;/span&gt;) { formula &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; formula_list[[3]] }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(method&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_outcome_wrong&amp;#34;&lt;/span&gt;) { formula &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; formula_list[[2]] }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;n_sample) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(df, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_boot_sample, replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# model &amp;lt;- glm(formula = formula, data = data, family = &amp;#34;binomial&amp;#34;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;model_function&lt;/span&gt;(formula,data,method&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;method)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  predicted_ate[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ate
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ate_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;future_lapply&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;n_boot, bootstrap_ate, 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                           data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; data, 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                           formula &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; formula,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                           n_boot_sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_boot_sample,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                           i_boot_seed &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; i_boot_seed,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                           method &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; method,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                           future.globals &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;list&lt;/span&gt;(g_comp &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; g_comp, model_function&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;model_function),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                           future.seed &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;NULL&lt;/span&gt;,  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Use NULL since you&amp;#39;re doing manual seeding&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                           future.packages &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;dplyr&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Convert list to numeric vector&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ate_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unlist&lt;/span&gt;(ate_vec)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# 95% percentile bootstrap confidence interval (2.5th and 97.5th percentiles)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  predicted_lower[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;quantile&lt;/span&gt;(ate_vec, &lt;span style=&#34;color:#099&#34;&gt;0.025&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  predicted_upper[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;quantile&lt;/span&gt;(ate_vec, &lt;span style=&#34;color:#099&#34;&gt;0.975&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;Sample&amp;#34;&lt;/span&gt;, i, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ATE:&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(ate, &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;), 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;CI: [&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(predicted_lower[i], &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;), &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;,&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;round&lt;/span&gt;(predicted_upper[i], &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;), &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;]&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  df_se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df_se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbind&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(method &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; method, ate&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;predicted_ate[i], lower&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;predicted_lower[i],upper&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;predicted_upper[i]))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Clean up: switch back to sequential processing to shut down the parallel workers&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plan&lt;/span&gt;(sequential)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_se2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df_se
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;save&lt;/span&gt;(df_se2, file &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_bootstrap_tails.rda&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;details&gt;
&lt;summary&gt;my messy code for tmle&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;### Change the tuning grid size (5 vs 20) in grid_space_filling() below&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(dplyr)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidymodels)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(future)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plan&lt;/span&gt;(multisession, workers &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;W4 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rnorm&lt;/span&gt;(n)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# TRUE propensity score model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-0.5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.8&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# TRUE outcome model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbinom&lt;/span&gt;(n, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Calculate TRUE ATE&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;logit_Y1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;logit_Y0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.6&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.4&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W2^2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.3&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0.2&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;W4^2
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y1_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(logit_Y1)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;Y0_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(logit_Y0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(Y1_true &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; Y0_true)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(W1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W1, W2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W2, W3 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W3, W4 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; W4, A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; A, Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; Y)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Resampling settings: n_sample repeated samples, each with n_i rows&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n_sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;6000&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_sample)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;pred_se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vector&lt;/span&gt;(mode &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;numeric&amp;#34;&lt;/span&gt;, length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_sample)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;n_sample) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set.seed&lt;/span&gt;(i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(df, n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; n_i, replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(Y,A,W1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;W4) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(Y), A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(A))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; data
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# outcome model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_spec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;boost_tree&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  trees &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  tree_depth &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  min_n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  loss_reduction &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# sample_size = tune(),&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mtry &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  learn_rate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set_engine&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;xgboost&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;set_mode&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;classification&amp;#34;&lt;/span&gt;)  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Create workflow&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_wf &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;workflow&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_model&lt;/span&gt;(xgb_spec) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_formula&lt;/span&gt;(Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; .)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Tuning grid&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;grid_space_filling&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;trees&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tree_depth&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;min_n&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;loss_reduction&lt;/span&gt;(),
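&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mtry&lt;/span&gt;(), &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(train, &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;Y)), &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# set the mtry range from the predictors only (exclude outcome Y)&lt;/span&gt;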
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;learn_rate&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  size &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt; &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# grid size: change here (5 vs 20)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Cross-validation and tuning&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;folds &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vfold_cv&lt;/span&gt;(train, v &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_res &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune_grid&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  xgb_wf,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  resamples &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; folds,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; xgb_grid,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  control &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;control_grid&lt;/span&gt;(save_pred &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                         parallel_over &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;everything&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                         verbose &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Select best model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;best_xgb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select_best&lt;/span&gt;(xgb_res, metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;roc_auc&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Finalize and fit&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_xgb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize_workflow&lt;/span&gt;(xgb_wf, best_xgb)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_fit &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;fit&lt;/span&gt;(final_xgb, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# predict&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train, type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.factor&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# -------------------------------------------------------&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# treatment model &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;train_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; train &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;Y)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_wf_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;workflow&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_model&lt;/span&gt;(xgb_spec) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%&amp;gt;%&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;add_formula&lt;/span&gt;(A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; .)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_grid_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;grid_space_filling&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;trees&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tree_depth&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;min_n&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;loss_reduction&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mtry&lt;/span&gt;(),train_tx),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;learn_rate&lt;/span&gt;(), 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  size &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt; &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# and here &amp;lt;-----------------&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Cross-validation and tuning&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;folds_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;vfold_cv&lt;/span&gt;(train_tx, v &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_res_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tune_grid&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  xgb_wf_tx,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  resamples &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; folds_tx,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  grid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; xgb_grid_tx,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  control &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;control_grid&lt;/span&gt;(save_pred &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                         verbose &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Select best model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;best_xgb_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select_best&lt;/span&gt;(xgb_res_tx, metric &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;roc_auc&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Finalize and fit&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_xgb_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;finalize_workflow&lt;/span&gt;(xgb_wf_tx, best_xgb_tx)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;final_fit_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;fit&lt;/span&gt;(final_xgb_tx, data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train_tx)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;predict&lt;/span&gt;(final_fit_tx, new_data &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; train_tx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;A), type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prob&amp;#34;&lt;/span&gt;)[,&lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ps_final &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ps
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# ps_final &amp;lt;- pmax(pmin(ps, 0.95), 0.05)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# reuse ps_final so that any truncation above also applies to a_1 and a_0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;a_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;a_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; ps_final)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;clever_covariate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(train_tx&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;ps_final, &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;ps_final))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#### Step 3. Estimate the fluctuation parameter (epsilon)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;epsilon_model &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;glm&lt;/span&gt;(train&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;Y &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;offset&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; clever_covariate, family &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;binomial&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summary&lt;/span&gt;(epsilon_model)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;epsilon &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; epsilon_model&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;coefficients
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#### Step 4. Update Initial Outcomes&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;updated_outcome_1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome_1)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;epsilon&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;a_1)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;updated_outcome_0 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;plogis&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;qlogis&lt;/span&gt;(outcome_0)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;epsilon&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;a_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#### Step 5. Compute ATE&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mean&lt;/span&gt;(updated_outcome_1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;updated_outcome_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#### Step 6. Standard error from the influence curve&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;updated_outcome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ifelse&lt;/span&gt;(train&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;A &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;, updated_outcome_1, updated_outcome_0)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;se &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sqrt&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;var&lt;/span&gt;((&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;(train&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;Y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;updated_outcome)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;clever_covariate&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;updated_outcome_1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;updated_outcome_0&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;ate)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;n_i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;predicted_ate[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; ate
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;pred_se[i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; se
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;i: &amp;#34;&lt;/span&gt;,i, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; ate: &amp;#34;&lt;/span&gt;, ate, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; se: &amp;#34;&lt;/span&gt;,se))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;((i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%%&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) { &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;save&lt;/span&gt;(predicted_ate, pred_se, file &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;predicted_ate_tmle_se_5.rda&amp;#34;&lt;/span&gt;) } &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# and make sure you change file name when saving too &amp;lt;--------- size 5 vs 20&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;details&gt;
&lt;summary&gt;visualization code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidyverse)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;load&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;logreg_bootstrap_tails.rda&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;load&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;predicted_ate_tmle_se_5.rda&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_tmle_5_se_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(ate&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;predicted_ate,se&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;pred_se)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;load&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;predicted_ate_tmle_se_20.rda&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;xgb_tmle_20_se_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(ate&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;predicted_ate,se&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;pred_se)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#### combine all&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;df_all_ci &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; xgb_tmle_20_se_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(lower_ci &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1.96&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;se,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         upper_ci &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1.96&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;se) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    method &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_xgboost_size20&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  ) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(method, ate, lower_ci, upper_ci) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbind&lt;/span&gt;(xgb_tmle_5_se_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(lower_ci &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1.96&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;se,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                 upper_ci &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; ate &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1.96&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;se) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            method &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;tmle_xgboost_size5&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          ) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(method,ate,lower_ci,upper_ci)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rbind&lt;/span&gt;(df_se2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;          &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rename&lt;/span&gt;(lower_ci&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;lower,upper_ci&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;upper)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(coverage &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    lower_ci &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;&lt;/span&gt; true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;left_miss&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    upper_ci &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;&lt;/span&gt; true_ATE &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;right_miss&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;within&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  )) 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;##### calculate prop&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;coverage_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df_all_ci &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;group_by&lt;/span&gt;(method,coverage) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summarize&lt;/span&gt;(prop &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;n&lt;/span&gt;()&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;*&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;100&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;/&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1000&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ungroup&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pivot_wider&lt;/span&gt;(id_cols &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;method&amp;#34;&lt;/span&gt;), names_from &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;coverage&amp;#34;&lt;/span&gt;, values_from &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;prop&amp;#34;&lt;/span&gt;, values_fill &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(stat &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;right missed: &amp;#34;&lt;/span&gt;, right_miss,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;%, covered: &amp;#34;&lt;/span&gt;,within,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;%, left missed: &amp;#34;&lt;/span&gt;,left_miss,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;%&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;coverage_order &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; coverage_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;arrange&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;desc&lt;/span&gt;(within)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(method)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;coverage_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; coverage_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(method &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;factor&lt;/span&gt;(method, levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; coverage_order))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;plot &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; df_all_ci &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;group_by&lt;/span&gt;(method) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;arrange&lt;/span&gt;(ate) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(row &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;row_number&lt;/span&gt;(),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;         method &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;factor&lt;/span&gt;(method, levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; coverage_order)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;ggplot&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;row,y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;ate,color&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;coverage)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_point&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_ribbon&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(ymin&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;lower_ci,ymax&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;upper_ci,fill&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;coverage),alpha&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_hline&lt;/span&gt;(yintercept &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; true_ATE, linewidth&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0.5&lt;/span&gt;, color&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;blue&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;geom_text&lt;/span&gt;(data&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;coverage_df, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;aes&lt;/span&gt;(x&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;500&lt;/span&gt;,y&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;-0.05&lt;/span&gt;,label&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;stat), inherit.aes &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme_bw&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;facet_wrap&lt;/span&gt;(.~method, ncol&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;theme&lt;/span&gt;(legend.position &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;bottom&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;xlab&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;bootstrap no.&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/tmle/index_files/figure-html/unnamed-chunk-31-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Wow, Frank was right to say &amp;ldquo;prepare to be disappointed!&amp;rdquo; 😅 The correctly specified logistic regression outcome model was the clear winner with 95.0% coverage and pretty symmetric tails (1.7% left miss, 3.3% right miss) - basically exactly what you&amp;rsquo;d want to see. The correctly specified treatment model with IPW did reasonably well at 92.4% coverage, though it had some asymmetry (7.0% left miss, 0.6% right miss). But wow, the misspecified logistic regression outcome model was a disaster - only 2.8% coverage with almost everything missing on the right tail (97.2% right miss)!&lt;/p&gt;
&lt;p&gt;Now here&amp;rsquo;s where it gets interesting: TMLE with XGBoost grid search size 5 achieved 87.6% coverage, which isn&amp;rsquo;t terrible, but it had this concerning asymmetry (11.3% left miss, 1.1% right miss). When I cranked up the grid search to size 20, thinking it would get better, it actually got worse - dropped to 71.1% coverage with misses on both sides (21.9% left miss, 7.0% right miss). So Frank&amp;rsquo;s warning was spot on! While TMLE definitely saved us from the catastrophic failure of misspecified parametric models, those confidence intervals just don&amp;rsquo;t behave properly. The machine learning component does great at capturing complex relationships without us having to specify all those weird interactions, but the price is wonky uncertainty estimates. Seems like TMLE&amp;rsquo;s real strength is getting better point estimates, not reliable confidence intervals. 🤔&lt;/p&gt;
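&lt;p&gt;One way to judge how far each of these coverage figures sits from the nominal 95% is to work out how much noise the simulation itself contributes. Below is a minimal sketch, assuming 1000 replicates per method (the denominator used in the coverage code above) and a simple binomial approximation; &lt;code&gt;n_rep&lt;/code&gt;, &lt;code&gt;mc_se&lt;/code&gt;, &lt;code&gt;half_width&lt;/code&gt;, &lt;code&gt;coverage_check&lt;/code&gt; and most of the method labels are hypothetical names for illustration, not objects from the analysis.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-r&#34;&gt;# Assumption: 1000 replicates per method, as in the coverage code above
n_rep &amp;lt;- 1000
nominal &amp;lt;- 0.95

# Monte Carlo standard error of an estimated coverage proportion
mc_se &amp;lt;- sqrt(nominal * (1 - nominal) / n_rep)

# Half-width of the range expected for a well-calibrated 95% interval
half_width &amp;lt;- qnorm(0.975) * mc_se

# Coverage proportions reported in the text (labels are illustrative)
observed &amp;lt;- c(
  logistic_correct = 0.950,
  ipw_correct = 0.924,
  tmle_xgboost_size5 = 0.876,
  tmle_xgboost_size20 = 0.711,
  logistic_misspecified = 0.028
)

# Flag which methods are compatible with nominal coverage
coverage_check &amp;lt;- data.frame(
  method = names(observed),
  coverage = observed,
  within_band = abs(observed - nominal) &amp;lt;= half_width,
  row.names = NULL
)
coverage_check
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Under these assumptions the expected range is roughly 93.6% to 96.4%, so only the correctly specified outcome model is compatible with nominal coverage; the 92.4% for IPW already falls a little short of what simulation noise alone would explain.&lt;/p&gt;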




&lt;h2 id=&#34;ack&#34;&gt;Acknowledgements
  &lt;a href=&#34;#ack&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Thanks to Frank Harrell for the pointers! We can now see the non-coverage tails much more clearly for all methods!&lt;/p&gt;




&lt;h2 id=&#34;opportunity&#34;&gt;Opportunities for improvement
  &lt;a href=&#34;#opportunity&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;learn to use &lt;code&gt;future_lapply&lt;/code&gt; with distributed computing&lt;/li&gt;
&lt;li&gt;test out different ML models for TMLE&lt;/li&gt;
&lt;li&gt;does it matter if we use all data (like above) or does ATE change with train/test split?&lt;/li&gt;
&lt;/ul&gt;




&lt;h2 id=&#34;lessons&#34;&gt;Lessons learnt
  &lt;a href=&#34;#lessons&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;learnt to include anchor&lt;/li&gt;
&lt;li&gt;learnt what offset does in &lt;code&gt;glm&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;learnt TMLE&amp;rsquo;s procedure&lt;/li&gt;
&lt;li&gt;learnt to use &lt;code&gt;future&lt;/code&gt; parallel computing&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you like this article:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;please feel free to send me a 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;comment or visit my other blogs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;please feel free to follow me on 
&lt;a href=&#34;https://bsky.app/profile/kenkoonwong.bsky.social&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;BlueSky&lt;/a&gt;, 
&lt;a href=&#34;https://twitter.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;twitter&lt;/a&gt;, 
&lt;a href=&#34;https://github.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub&lt;/a&gt; or 
&lt;a href=&#34;https://med-mastodon.com/@kenkoonwong&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Mastodon&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;if you would like to collaborate, please feel free to 
&lt;a href=&#34;https://www.kenkoonwong.com/contact/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;contact me&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>#IDWeek2025 Posts/Tweets Analysis</title>
      <link>https://www.kenkoonwong.com/blog/idweek25/</link>
      <pubDate>Sun, 26 Oct 2025 00:00:00 +0000</pubDate>
      
      <guid>https://www.kenkoonwong.com/blog/idweek25/</guid>
      <description>&lt;script src=&#34;https://www.kenkoonwong.com/blog/idweek25/index_files/kePrint/kePrint.js&#34;&gt;&lt;/script&gt;
&lt;link href=&#34;https://www.kenkoonwong.com/blog/idweek25/index_files/lightable/lightable.css&#34; rel=&#34;stylesheet&#34; /&gt;
&lt;blockquote&gt;
&lt;p&gt;IDWeek2025 had the lowest post count since 2022 (1,272 posts). While top posters shifted to Bluesky, X remained on par—the key difference was engagement rates favoring Bluesky. Grateful to learn so much from our SoMe community! 📊💙🙏&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src=&#34;idweek25_wordcloud.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Wordcloud of all the &lt;code&gt;IDweek2025&lt;/code&gt;, &lt;code&gt;idweek25&lt;/code&gt;  and &lt;code&gt;idweek&lt;/code&gt; tweets/posts.&lt;/p&gt;
&lt;p&gt;If you want to look at the specific tweets, I have created a shiny app that helps me to glance through essential topics. Here is the 
&lt;a href=&#34;https://kenkoonwong.shinyapps.io/idweek25/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;link&lt;/a&gt;&lt;/p&gt;




&lt;h1 id=&#34;thought-process&#34;&gt;Thought Process:
  &lt;a href=&#34;#thought-process&#34;&gt;&lt;/a&gt;
&lt;/h1&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#count&#34;&gt;Post counts by days&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#top50&#34;&gt;Top 50 Users Post Counts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#animate&#34;&gt;Posts frequency separated by dates&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#like&#34;&gt;Top 10 Liked Posts Separated by Dates&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#bookmark&#34;&gt;Top 20 Bookmarked Posts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#platform&#34;&gt;Bluesky vs X&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#ack&#34;&gt;Acknowledgement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#lesson&#34;&gt;Lessons learnt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This year is a fun challenge because there are 2 platforms where the ID community has been posting: our old friend X/Twitter and now Bluesky. As we&amp;rsquo;ve monitored throughout the year, the community does seem to have split between the two. Hence, #IDWeek is a great experiment to assess our ID community&amp;rsquo;s network and engagement. We&amp;rsquo;ll be monitoring both platforms to see how many cool things people have been up to! As usual, we will list the top 50 users who contributed to #IDweek posts, the top liked posts, and the top bookmarked posts. Our goal here is to learn ID-related information and also focus on the positives! ❤️&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Disclaimers: All data were obtained from the Bluesky and X APIs. Pictures shown here are linked directly to the source, except for the ones from Grok, for which I&amp;rsquo;ve linked to the source.&lt;/em&gt;&lt;/p&gt;




&lt;h2 id=&#34;count&#34;&gt;Post counts (Bluesky and X combined) by days
  &lt;a href=&#34;#count&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/idweek25/index_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;672&#34; /&gt;
&lt;br&gt;
&lt;blockquote&gt;
&lt;p&gt;We had the fewest posts and tweets combined this year compared to 2024, 2023 and 2022!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Very interesting. Compared to IDWeek 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/idweek2022/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;2022&lt;/a&gt;, 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/idweek2023/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;2023&lt;/a&gt;, and 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/idweek2024/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;2024&lt;/a&gt;, the post peaks appear most similar to 2024. This year also had the fewest posts of the four years. We had a total of &lt;code&gt;1272&lt;/code&gt; (it might actually be &lt;code&gt;1234&lt;/code&gt;, 
&lt;a href=&#34;#duplicate&#34;&gt;see this&lt;/a&gt;) posts this year, whereas in 2024 we had &lt;code&gt;1674&lt;/code&gt; posts, in 2023 we had &lt;code&gt;2627&lt;/code&gt; posts, and in 2022 it was &lt;code&gt;~2188&lt;/code&gt;. It is clear that the ID community&amp;rsquo;s posting activity on SoMe decreased this year.&lt;/p&gt;
&lt;br&gt;




&lt;h2 id=&#34;top50&#34;&gt;Top 50 Users (Both Bluesky and X) Post Counts
  &lt;a href=&#34;#top50&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/idweek25/index_files/figure-html/unnamed-chunk-3-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Thanks to all who have contributed! &lt;code&gt;sairabt.bsky.social&lt;/code&gt;, &lt;code&gt;LordAlirezaF&lt;/code&gt;, &lt;code&gt;sidpharm.bsky.social&lt;/code&gt;, &lt;code&gt;josephmarcusid.medsky.social&lt;/code&gt;, &lt;code&gt;uwidfellowship.bsky.social&lt;/code&gt;, &lt;code&gt;IDWeekmtg&lt;/code&gt;, &lt;code&gt;idweek.bsky.social&lt;/code&gt;, &lt;code&gt;iuidfellowship.bsky.social&lt;/code&gt;, &lt;code&gt;InfectDiseaseAd&lt;/code&gt;, &lt;code&gt;WebsEdge_Med&lt;/code&gt;, &lt;code&gt;boghuma.bsky.social&lt;/code&gt;, &lt;code&gt;patrickching.bsky.social&lt;/code&gt;, &lt;code&gt;jschaenmanmd.bsky.social&lt;/code&gt;, &lt;code&gt;pRxcisionAI&lt;/code&gt;, &lt;code&gt;CUP_med_health&lt;/code&gt;, &lt;code&gt;ccf_idfellows&lt;/code&gt;, &lt;code&gt;websedgemedicine.bsky.social&lt;/code&gt;, &lt;code&gt;Contagion_Live&lt;/code&gt;, &lt;code&gt;contagionlive.bsky.social&lt;/code&gt;, &lt;code&gt;HMS_MI&lt;/code&gt; lead the top 20 posters combined. Can you find your handle here?&lt;/p&gt;
&lt;p&gt;It is also very interesting to see that this year&amp;rsquo;s top posters comprise more Bluesky users than X users. But you do see some familiar usernames that have traditionally been top posters on X, such as &lt;code&gt;LordAlirezaF&lt;/code&gt;. An easy way to tell Bluesky from X is that a Bluesky username usually ends in a domain such as &lt;code&gt;bsky.social&lt;/code&gt; or &lt;code&gt;medsky.social&lt;/code&gt;.&lt;/p&gt;
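&lt;p&gt;That rule of thumb can be written down in a couple of lines. Here is a minimal sketch, assuming handles are stored as plain strings and that any handle containing a dot is a domain-style Bluesky handle (X usernames only allow letters, numbers and underscores); &lt;code&gt;handles&lt;/code&gt; and &lt;code&gt;platform&lt;/code&gt; are hypothetical names for illustration, not objects from the original analysis.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-r&#34;&gt;# Hypothetical sample of handles taken from the top posters listed above
handles &amp;lt;- c(&amp;#34;sairabt.bsky.social&amp;#34;, &amp;#34;LordAlirezaF&amp;#34;,
             &amp;#34;josephmarcusid.medsky.social&amp;#34;, &amp;#34;IDWeekmtg&amp;#34;)

# A dot marks a domain-style (Bluesky) handle; X usernames cannot contain one
platform &amp;lt;- ifelse(grepl(&amp;#34;.&amp;#34;, handles, fixed = TRUE), &amp;#34;Bluesky&amp;#34;, &amp;#34;X&amp;#34;)

data.frame(handle = handles, platform = platform)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Matching on a literal dot (&lt;code&gt;fixed = TRUE&lt;/code&gt;) avoids having to maintain a list of known Bluesky domains, at the cost of trusting that the handles were not altered during collection.&lt;/p&gt;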
&lt;br&gt;




&lt;h2 id=&#34;animate&#34;&gt;Posts frequency separated by dates
  &lt;a href=&#34;#animate&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;&lt;img src=&#34;post_anim.gif&#34; alt=&#34;&#34;&gt;&lt;/p&gt;




&lt;h2 id=&#34;like&#34;&gt;Top 10 Liked Posts Separated by Dates
  &lt;a href=&#34;#like&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;




&lt;h4 id=&#34;10-18-25&#34;&gt;10-18-25
  &lt;a href=&#34;#10-18-25&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; date &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; username &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; text &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; like_count &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; mmpharmd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; As per tradition, the bracelets were counted this morning in the hotel room with @ebhirsch.bsky.social .  Last year we came in at 169.  Any guesses how many were made this year??  #IDWeek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 14 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; dralicehan.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; #IDsky
Just arrived in Atlanta! Excited to see everyone at #IDweek. I’ll discuss any questions or advice for fellows interested in private practice in ID. Hope to see you tomorrow at Fellow’s Day! &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 12 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; TakaMatsuo_ID &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Arrived in Atlanta!!
&lt;p&gt;Looking forward to connecting with everyone at #IDWeek2025!&lt;/p&gt;
&lt;p&gt;@IDWeekmtg @MDAndersonNews @MayoClinicINFD @UT_Infectious @BCMIDFellowship 
&lt;a href=&#34;https://t.co/t75aEsrlM2&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://t.co/t75aEsrlM2&lt;/a&gt;&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 12 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; darcy-id-doc.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Great Program Director’s Meeting at #IDWeek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 11 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; brianchowmd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Welcome to #IDWeek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 10 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; pascalisid.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Please join us this Monday 8-9 am at IDWeek. Difficult-to-treat mold infections. Dimitrios Farmakiotis will be discussing treatment of mucormycosis. I will be discussing aspergillosis. &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 9 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; ebhirsch.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Absolutely phenomenal @sidpharm.bsky.social keynote 🔑 from Dr. Mati Hlatshwayo Davis @matih-id.bsky.social today: Reducing Barriers Toward Health Equity: Advocating for Change in the Healthcare Industry and Beyond. 
&lt;p&gt;I laughed, I cried. All the things. Just wow 👏👏👏 #IDWeek2025 #IDSky&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 9 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; mmpharmd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Look what I found! @idsainfo.bsky.social @idweek.bsky.social @idweekoutbreak.bsky.social @sidpharm.bsky.social #IDWeek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 9 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; josephmarcusid.medsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; See you tomorrow!
&lt;p&gt;#IDSky #IDWeek2025&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 7 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; sidpharm.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Who&#39;s excited for the Annual Meeting today?! If you&#39;re in ATL for IDweek, and unable to attend SIDP&#39;s Annual meeting, check out these SIDP members pre-meeting workshops! #sidp #idweek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 6 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;This is the pre-meeting day. Looks like everyone is super pumped for the conference! Not to mention the Program Director Day pictures shared by &lt;code&gt;darcy-id-doc.bsky.social&lt;/code&gt; 
&lt;a href=&#34;https://bsky.app/profile/darcy-id-doc.bsky.social/post/3m3j5vgza5c2i&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;source&lt;/a&gt;. A very important leadership meeting for our next generation of ID docs!&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:qpnpyygbi5uupliavxp2cc6x/bafkreibctfv4d6px3slikcgkd4klr7gnawb5hm7rw72bsh3oxbxnuifv2e@jpeg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;




&lt;h4 id=&#34;10-19-25&#34;&gt;10-19-25
  &lt;a href=&#34;#10-19-25&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; date &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; username &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; text &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; like_count &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; uwidfellowship.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Prolonged applause for Dr Demetri Daskalakis, formerly of the CDC, winner of the HIV Medical Association Transformative Leader Award… #IDSky #IDWeek #IDStrong &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 118 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; boghuma.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Mandatory meeting selfie ##IDweek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 58 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; patrickching.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; IDWeek2025 Social Media Meetup
@jonathanrydermd.bsky.social @josephmarcusid.medsky.social
#IDWeek2025 #IDSky &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 48 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; boghuma.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Who is in Atlanta for  #IDweek2025? 
If you see me at the convention center please say hi ✋. It&#39;s always better to see/meet people IRL.
#IDSky &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 36 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; jonathanrydermd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Let’s go!!! Hope to meet many of you at tonight’s social media meet-up! #IDWeek2025 @idweek.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 34 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; jschaenmanmd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Huge turn out for #IDWeek2025 @idweek.bsky.social Plenary Session with Javier Muñoz ! 🤩 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 24 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; davidvanduin.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Dr Tamma mentions the ongoing GOAT trial (PI Tamma &amp;amp; Cosgrove)
This will be a most needed and important study. clinicaltrials.gov/study/NCT060...
#IDweek2025 #IDsky &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 19 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; docdelrio.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; #IDWeek2025 kicks off today, &amp;amp; @emorydeptofmed.bsky.social  is well represented with a total of 96 Emory-affiliated presentation. Don&#39;t miss talks by @colleenkelly.bsky.social &amp;amp; @gradydoctor.bsky.social   Full list of Emory sessions: bit.ly/EmoryIDWeek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 19 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; umn-idim.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; We’re heading to #IDWeek2025!
&lt;p&gt;Along with our colleagues at the Minnesota Department of Health and across the Twin Cities, we will be sharing:&lt;/p&gt;
&lt;p&gt;📰 26 posters
🎤 12 symposiums
🗣️ 4 oral abstracts
🏆 1 plenary&lt;/p&gt;
&lt;p&gt;We’re excited to showcase the incredible work coming out of Minnesota. See you there! #IDSky&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 19 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; davidzd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; This is what leadership looks like. 
#VaccinesSaveLives
#IDWeek2025
#IDSky &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 17 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;At the top, contributed by &lt;code&gt;uwidfellowship.bsky.social&lt;/code&gt;, prolonged applause for Dr Demetri Daskalakis, winner of the HIV Medical Association Transformative Leader Award. 👏 
&lt;a href=&#34;https://bsky.app/profile/uwidfellowship.bsky.social/post/3m3l5ty52k225&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;source&lt;/a&gt;
&lt;img src=&#34;https://video.bsky.app/watch/did%3Aplc%3Aymseunejfv6wwo24omcnpsxc/bafkreic4oc3cftjod6srtne7aidtn2ro6btf6hn7a47vtnbgm2thpan7nm/thumbnail.jpg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;br&gt;
&lt;p&gt;The first day is always a special day and mandatory selfie day! Here we have &lt;code&gt;boghuma.bsky.social&lt;/code&gt;&amp;rsquo;s selfie post with the second-highest likes! 
&lt;a href=&#34;https://bsky.app/profile/boghuma.bsky.social/post/3m3kwq2ox5c2o&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;source&lt;/a&gt;
&lt;img src=&#34;https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:skpxj6dpacrxznfvwrpfdiw4/bafkreifsmhbbxlstsw4iz5vzavasq2kl64ahg4t63rxtbr4gavtpwizc44@jpeg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;br&gt;
&lt;p&gt;This day is a special day for SoMe! Thanks to &lt;code&gt;patrickching.bsky.social&lt;/code&gt; for sharing the pictures on his post about 
&lt;a href=&#34;https://bsky.app/profile/patrickching.bsky.social/post/3m3lipvpwgs2j&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;IDWeek2025 Social Media Event&lt;/a&gt;, amplified by &lt;code&gt;jonathanrydermd.bsky.social&lt;/code&gt;&amp;rsquo;s engagement!&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:yjb2nsyihm4v4r324fpc25du/bafkreiewrbo2gcnngmk544fksrpmknqjtmc2i75ahel7svd5sspzu7rgq4@jpeg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:tufqamqzkky3ed4nur2znn5r/bafkreie5aiute6fcnmjl5g4t3bw3cs7tvyfd3lnmognelg5oawmmnroehe@jpeg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;br&gt;
&lt;p&gt;&lt;code&gt;davidvanduin.bsky.social&lt;/code&gt; highlighted the importance of the 
&lt;a href=&#34;https://clinicaltrials.gov/study/NCT06080698&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GOAT Trial&lt;/a&gt; (Gram Negative BSI Oral Antibiotic Therapy Trial), an RCT at 11 US sites comparing IV therapy with early transition to oral antibiotics. The trial ends around July 2027.
&lt;img src=&#34;https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:iqh3klzgjyqeqnocehrw4ryd/bafkreiboktbqoahlim6etfalommfn5h3ohlpu5e233nclgudsd2igkua2e@jpeg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;




&lt;h4 id=&#34;10-20-25&#34;&gt;10-20-25
  &lt;a href=&#34;#10-20-25&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; date &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; username &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; text &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; like_count &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; boghuma.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Easily my favorite session at #IDWEEK #IDSky &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 60 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; DrIanWeissman &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; CDC researchers are being forced to skip a pivotal conference on infectious disease this week due to the government shutdown, missing out on high-level discussions not long after surges in measles and whooping cough hit the U.S.
https://t.co/mij0HnHywI &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 22 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; boghuma.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Afternoon delight.
Challenging cases HIV and STI cases.
First up a serological standoff
RPR 1:4--&amp;gt; 1:8, patient with ocular and otic symptoms.  Does not meet the 4 -fold increase threshold for new infection. So do you treat for syphilis now or not?
#IDweek #IDSky &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 21 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; hectorjoselora &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Recap 10/20 of @IDWeekmtg @IDSAInfo #IDWEEK2025
&lt;p&gt;-Proudest day ever! poster (1/2) led by Isabel Cintron  and medical students. First time advising/mentoring a project from medical students. #MedEd&lt;/p&gt;
&lt;p&gt;-With the ID Docs😎🧫🦠
@mobrito05 @PaulinoRobert  Dr. Bisonó 
&lt;a href=&#34;https://t.co/f9LzQpn5hn&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://t.co/f9LzQpn5hn&lt;/a&gt;&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 17 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; courtharrismd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; #IDWeek2025 off to a 🔥start! first in person Women in ID Book Club reading the Rainfall Market! 
&lt;p&gt;Thankful for these women who continue to come out to support and enjoy each others company! @idweek.bsky.social @jennmpd.bsky.social @sairabt.bsky.social @docwoc71.bsky.social @dralicehan.bsky.social&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 14 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; drrossanarosa.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Happy International Infection Prevention Week!
#IDSky
#IIPW
#IDWeek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 12 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; rachelalter007.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; It&#39;s pretty freaking awesome that the same people the keynote speakers is thanking for being her mentors may be mine too in a few short months 🤞🏻🤞🏻🤞🏻
&lt;p&gt;#idweek2025&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 12 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; cidrap.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; C difficile deaths more common among Whites, people in large urban areas
&lt;p&gt;White Americans accounted for 84% of the 216,311 C difficile-associated deaths reported from 1999 through 2023, researchers reported at IDWeek 2025.&lt;/p&gt;
&lt;p&gt;
&lt;a href=&#34;https://www.cidrap.umn.edu/c&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;www.cidrap.umn.edu/c&lt;/a&gt;&amp;hellip;&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 11 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; boghuma.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; DoxyPEP can alter serological titers for RPR with slower rise in titers. Since suspicion of otic and ocular syphilis high. Treatment is started.
#idweek  #idsky &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 11 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; idweek.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Vaccination against herpes zoster, or shingles, is linked to lower risks of heart disease, dementia and death in people age 50 and older, according to new research presented at #IDWeek2025. https://bit.ly/471biNp &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 10 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Challenging cases! Where we learn from others&amp;rsquo; experiences.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:skpxj6dpacrxznfvwrpfdiw4/bafkreigcf6jbmsuclsdhmhtbakl7t42eulrnfp3yogiiot4erp7hje7rum@jpeg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Here &lt;code&gt;boghuma.bsky.social&lt;/code&gt; shared the first case: progressive pneumonia with blood and respiratory cultures growing Bacillus cereus, which turned out to be welder&amp;rsquo;s anthrax, an emerging entity that has caused 9 cases in Louisiana. The second case was AML with severe flank pain, with the mold &lt;code&gt;Lichtheimia corymbifera&lt;/code&gt;!?! Wow.&lt;/p&gt;
&lt;p&gt;Followed by another interesting post by &lt;code&gt;boghuma.bsky.social&lt;/code&gt; about a patient on DoxyPEP with RPR 1:4&amp;ndash;&amp;gt; 1:8 and ocular and otic symptoms. To treat or not to treat? The presentation showed how DoxyPEP may slow the rise of RPR titers in the setting of active infection. Very interesting!
&lt;img src=&#34;https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:skpxj6dpacrxznfvwrpfdiw4/bafkreibkzrjrz5c55c2qd2plh4mmkkhbnzkfcgiqp7mg42s7qfh7j3wqqq@jpeg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;An interesting finding on C. diff: in a review of data from the US Centers for Disease Control and Prevention (CDC), a team led by researchers from AdventHealth Sebring in Florida found that White people accounted for 83.9% of the 216,311 reported US deaths associated with CDI from 1999 through 2023, while Black Americans accounted for 8.1% and Hispanic people 5.5%. 
&lt;a href=&#34;https://www.cidrap.umn.edu/clostridium-difficile/c-difficile-deaths-more-common-among-whites-people-large-urban-areas&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;A new study of over 174,000 adults found that shingles vaccination significantly reduced health risks in people 50 and older, including a 50% lower risk of vascular dementia, 27% lower risk of blood clots, 25% lower risk of heart attack or stroke, and 21% lower risk of death. The findings suggest the shingles vaccine provides protection beyond preventing shingles itself by reducing cardiovascular and neurologic complications that can be triggered by shingles infection.
&lt;a href=&#34;https://www.idsociety.org/news--publications-new/articles/2025/shingles-vaccine-lowers-risk-of-dementia-major-cardiovascular-events/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;source&lt;/a&gt;. Following this, some people on X asked Grok to verify the findings, and here are the responses: 
&lt;a href=&#34;https://x.com/grok/status/1980873515034374651&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;source1&lt;/a&gt;, 
&lt;a href=&#34;https://x.com/grok/status/1981112185373020665&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;source2&lt;/a&gt;&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;grok1.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;grok2.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;




&lt;h4 id=&#34;10-21-25&#34;&gt;10-21-25
  &lt;a href=&#34;#10-21-25&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; date &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; username &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; text &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; like_count &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; boghuma.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; The absence of our CDC and NIH colleagues is profoundly felt at this year&#39;s #IDweek2025 meeting happening in Atlanta the city where the agency is head-quartered. 
www.medpagetoday.com/infectiousdi... &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 84 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; cidrap.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Drugmakers announce encouraging new antibiotic data at IDWeek
&lt;p&gt;Shionogi presents real-world data on cefiderocol&amp;rsquo;s effectiveness against tough-to-treat gram-negative infections, while GSK releases promising phase 3 data for tebipenem HBr.&lt;/p&gt;
&lt;p&gt;
&lt;a href=&#34;https://www.cidrap.umn.edu/a&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;www.cidrap.umn.edu/a&lt;/a&gt;&amp;hellip;&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 16 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; OslerResidency &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Much #Oslerpride at IDWeek! What a stellar group 😍💪🏽👏🏽 https://t.co/82ZJozdkro &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 13 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; michaelisonmd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Due to the US Government Shutdown, I will not be able to atted @IDWeek.bsky.social #IDWeek2025.  I had worked on my 2 invited talks and so will share them.
&lt;p&gt;My first talk was going to be about the epidemiology and clinical course of respiratory viral infections in the immunocompromised.&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 12 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; takamatsuo-id.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Ready for 2 oral presentations today! (same session)
&lt;p&gt;Session Title: Bad to the Bone: News From BJI
Session Date: Tuesday October 21, 2025
Session Time: 10:30 AM - 11:45 AM 
Location: B401-402&lt;/p&gt;
&lt;p&gt;#IDWeek2025 #MayoClinicINFD @idweek.bsky.social #OrthoID #SepticArthritis #NVO&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 10 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; jonathanrydermd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; @nkusnikmd.bsky.social presenting our blood culture shortage experience during an oral abstract session appropriately titled “A Bloody Mess!” @idweek.bsky.social #IDWeek2025 @unmc-id.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 10 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; jeremytigh.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Increasing NDM across the U.S.😭  #IDWeek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 9 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; healioid.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; There is more evidence that #shingles vaccination significantly decreases the risk for heart disease, dementia and death among people aged 50 years or older with herpes zoster infection, according to data presented at #IDWeek2025.
&lt;p&gt;💉 #IDSky&lt;/p&gt;
&lt;p&gt;🥼 #MedSky&lt;/p&gt;
&lt;p&gt;
&lt;a href=&#34;https://www.healio.com/news/infecti&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;www.healio.com/news/infecti&lt;/a&gt;&amp;hellip;&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 9 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; dricks.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; CDC experts—what&#39;s left of them—now forced to skip a pivotal infectious diseases meeting because of the gov&#39;t shutdown.
Their conference appearances have been postponed unless funded outside of government budgets. IDWeek is the largest meeting of infectious disease experts in the US👇 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 9 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; jonathanrydermd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; @takamatsuo-id.bsky.social presenting back-to-back oral abstracts on bone and joint infections @idweek.bsky.social! #IDWeek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 9 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Drugmakers announce encouraging new antibiotic data. &lt;code&gt;Cefiderocol (Fetroja)&lt;/code&gt; achieved a 70.1% clinical cure rate in over 500 US patients with serious gram-negative bacterial infections, with higher success rates when used empirically (73.7%) rather than as salvage therapy (54.3%). Additionally, GSK announced that their experimental oral antibiotic &lt;code&gt;tebipenem HBr&lt;/code&gt; was non-inferior to IV antibiotics for complicated urinary tract infections in a phase 3 trial, potentially becoming the first oral carbapenem available in the US for these infections. 
&lt;a href=&#34;https://www.cidrap.umn.edu/antimicrobial-stewardship/drugmakers-announce-encouraging-new-antibiotic-data-idweek&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;source&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;NDM on the RISE! 461% increase!!!&lt;br&gt;
&lt;img src=&#34;https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:o5d2vc55ugpy7ios6k2ydzeu/bafkreif2f6thg5gybhirqhgo56sih23vsxx6tyhd3bpggzpxeymc2j6vam@jpeg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;And more BoJo (bone and joint) oral abstracts by &lt;code&gt;takamatsuo-id.bsky.social&lt;/code&gt; (
&lt;a href=&#34;https://bsky.app/profile/takamatsuo-id.bsky.social/post/3m3pd2blzoc2k&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;more info&lt;/a&gt;). Can&amp;rsquo;t wait to watch them online.&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;https://cdn.bsky.app/img/feed_thumbnail/plain/did:plc:yjb2nsyihm4v4r324fpc25du/bafkreifx7jiyjyulp3ojazhqxmma6xu3klrfwofn5u4t2obrx4s4zwu5yy@jpeg&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;




&lt;h4 id=&#34;10-22-25&#34;&gt;10-22-25
  &lt;a href=&#34;#10-22-25&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; date &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; username &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; text &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; like_count &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; boghuma.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; This year&#39;s #IDweek is now over. Our community has taken so many hits since the pandemic and now the assaults on public health, science, vaccines etc.
Yet the infectious disease community has some of the most dedicated, resilient and thoughtful people I know.
Here&#39;s to hope &amp;amp; staying with the fight. &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 80 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; joffirphd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; This is good news.
&lt;p&gt;&amp;ldquo;The New England Journal of Medicine and the Center for Infectious Disease Research and Policy will begin publishing “public health alerts” in the coming month, CIDRAP Director Michael Osterholm announced at the IDWeek conference on Sunday.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;
&lt;a href=&#34;https://www.statnews.com/2025/10/21/c&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;www.statnews.com/2025/10/21/c&lt;/a&gt;&amp;hellip;&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 41 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; sairabt.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; #idweek2025 new in #UTI IDSA (GRADE methodology: clinical cure critical based on pt feedback) vs wikiguidelines (clear recommendation vs clinical review) 1/ &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 24 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; docwoc71.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Getting ready for #IDBugBowl @idweek.bsky.social  Hope to see y’alll this afternoon.  Come and cheer ur team UAB CCF ThomasJefferson and our international team Calcutta Troo Med #IDWeek2025 #IDSky &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 12 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; kemarbarrettmd &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; First #IDWeek2025 was great! Happy to share our work with the broader 🆔 community! Always a great time with the @MayoClinicINFD family. See you next year in DC! 🥳 https://t.co/Q0zGVu6NIF &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 12 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; democratswin.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; The New England Journal of Medicine and the Center for Infectious Disease Research and Policy will begin publishing “public health alerts” in the coming month, CIDRAP Director Michael Osterholm announced at the IDWeek conference on Sunday. 
&lt;p&gt;
&lt;a href=&#34;https://www.statnews.com/2025/10/21/c&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;www.statnews.com/2025/10/21/c&lt;/a&gt;&amp;hellip;&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 10 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Medscape &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; It’s possible to produce a generic version of lenacapavir, a twice-yearly antiretroviral injection that both prevents and treats HIV, for as little as $25 per patient per year, according to a study presented at the Infectious Disease Week (IDWeek) 2025 Annual Meeting in Atlanta. https://t.co/lUTq58OijU &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 10 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; WebsEdge_Med &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; &#34;They&#39;re waging psychological warfare on our young scientists&#34;. At IDWeek 2025, we sat down with @PeterHotez to discuss the state of vaccines in the US, and what the future holds for experts in infectious disease.  
&lt;p&gt;Watch the full conversation on WebsEdge Medicine YouTube, out 
&lt;a href=&#34;https://t.co/7y0pRBI8YB&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://t.co/7y0pRBI8YB&lt;/a&gt;&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 10 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; idweek.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Vaccines cause adults. #IDWeek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 8 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; docwoc71.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; About last night: on the 3rd day of #IDWeek2025 we had a jam
Pack nightly activities from #TID get together to the cultural dance party @idweek.bsky.social @idsainfo.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 8 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;It&amp;rsquo;s BUG BOWL TIME! Thanks to &lt;code&gt;docwoc71.bsky.social&lt;/code&gt; for highlighting it!
&lt;img src=&#34;https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:inyp24pv3zpnhxu7jrf6xfa4/bafkreifc3e7sarrhwcqztulmdwyvkrgtl7orwls7jh7rm3zo5mttd7wr3m@jpeg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;sairabt.bsky.social&lt;/code&gt; highlighted the #UTI IDSA (GRADE methodology: clinical cure critical based on pt feedback) vs wikiguidelines (clear recommendation vs clinical review) 
&lt;a href=&#34;https://bsky.app/profile/sairabt.bsky.social/post/3m3rt5kie3s2w&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;see more&lt;/a&gt;
&lt;img src=&#34;https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:w3qy7jvbl6l3h5iyvna7u7wm/bafkreigzvrqram4jjpbt5zwg5jdfzeffngpiqrxj2xrouifqhghxcci53u@jpeg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Contributed by &lt;code&gt;Medscape&lt;/code&gt; on X. It’s possible to produce a generic version of lenacapavir, a twice-yearly antiretroviral injection that both prevents and treats HIV, for as little as $25 per patient per year!&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;https://pbs.twimg.com/media/G34WIHnW0AEhDCQ?format=jpg&amp;name=large&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;




&lt;h3 id=&#34;bookmark&#34;&gt;Top 20 Bookmarked Posts
  &lt;a href=&#34;#bookmark&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; date &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; username &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; text &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; bookmark_count &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; boghuma.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Easily my favorite session at #IDWEEK #IDSky &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 3 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; boghuma.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Afternoon delight.
Challenging cases HIV and STI cases.
First up a serological standoff
RPR 1:4--&amp;gt; 1:8, patient with ocular and otic symptoms.  Does not meet the 4 -fold increase threshold for new infection. So do you treat for syphilis now or not?
#IDweek #IDSky &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 3 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; michaelisonmd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Due to the US Government Shutdown, I will not be able to atted @IDWeek.bsky.social #IDWeek2025.  I had worked on my 2 invited talks and so will share them.
&lt;p&gt;My first talk was going to be about the epidemiology and clinical course of respiratory viral infections in the immunocompromised.&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 3 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; idstewardship.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 🤩 Great talks in the APP bootcamp session at #IDweek2025 
&lt;p&gt;Back to basics sure, but contextualized by high quality presenters = 🙌&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 3 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; WebsEdge_Med &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; &#34;They&#39;re waging psychological warfare on our young scientists&#34;. At IDWeek 2025, we sat down with @PeterHotez to discuss the state of vaccines in the US, and what the future holds for experts in infectious disease.  
&lt;p&gt;Watch the full conversation on WebsEdge Medicine YouTube, out 
&lt;a href=&#34;https://t.co/7y0pRBI8YB&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://t.co/7y0pRBI8YB&lt;/a&gt;&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 3 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; patrickching.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; IDWeek2025 Social Media Meetup
@jonathanrydermd.bsky.social @josephmarcusid.medsky.social
#IDWeek2025 #IDSky &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; uwidfellowship.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Prolonged applause for Dr Demetri Daskalakis, formerly of the CDC, winner of the HIV Medical Association Transformative Leader Award… #IDSky #IDWeek #IDStrong &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; sairabt.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; #idweek2025 new in #UTI IDSA (GRADE methodology: clinical cure critical based on pt feedback) vs wikiguidelines (clear recommendation vs clinical review) 1/ &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; idweek.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Vaccination against herpes zoster, or shingles, is linked to lower risks of heart disease, dementia and death in people age 50 and older, according to new research presented at #IDWeek2025. https://bit.ly/471biNp &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; joffirphd.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; This is good news.
&lt;p&gt;&amp;ldquo;The New England Journal of Medicine and the Center for Infectious Disease Research and Policy will begin publishing “public health alerts” in the coming month, CIDRAP Director Michael Osterholm announced at the IDWeek conference on Sunday.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;
&lt;a href=&#34;https://www.statnews.com/2025/10/21/c&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;www.statnews.com/2025/10/21/c&lt;/a&gt;&amp;hellip;&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; sairabt.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; #idweek2025 #idsky what’s hot in ID clinical science Dr Wright #BSI #BALANCEtrial &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; patrickching.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Big Beasts of #ClinicalMycology #IDWeek2025
@idweek.bsky.social #IDSky
Rare mold infections &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Medscape &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; It’s possible to produce a generic version of lenacapavir, a twice-yearly antiretroviral injection that both prevents and treats HIV, for as little as $25 per patient per year, according to a study presented at the Infectious Disease Week (IDWeek) 2025 Annual Meeting in Atlanta. https://t.co/lUTq58OijU &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; ABsteward &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; @AlastairMcA30 @BradSpellberg @DrToddLee The tyranny of experts!
I love @BradSpellberg Ed commentary 
Opinion-Based Recommendations: Beware the Tyranny of Experts #IDWeek2025
https://t.co/k7rNeoKVN8 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; ABsteward &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Any late-breaking abstracts or practice-changing RCTs being presented at #IDWeek2025 — thinking SNAP, MERINO, OVIVA, POET, CAMERA2 etc? Please tag presenters/teams if you know them #IDWeek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; AnciraBecky &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Shingles Vaccination Tied to Drops in Cardiovascular Events, Dementia, Death https://t.co/A6TdjI9k0i &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 2 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; cidrap.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; C difficile deaths more common among Whites, people in large urban areas
&lt;p&gt;White Americans accounted for 84% of the 216,311 C difficile-associated deaths reported from 1999 through 2023, researchers reported at IDWeek 2025.&lt;/p&gt;
&lt;p&gt;
&lt;a href=&#34;https://www.cidrap.umn.edu/c&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;www.cidrap.umn.edu/c&lt;/a&gt;&amp;hellip;&lt;/p&gt; &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; davidvanduin.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Dr Tamma mentions the ongoing GOAT trial (PI Tamma &amp;amp; Cosgrove)
This will be a most needed and important study. clinicaltrials.gov/study/NCT060...
#IDweek2025 #IDsky &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; jeremytigh.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; Increasing NDM across the U.S.😭  #IDWeek2025 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; uwidfellowship.bsky.social &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; New from IDSA, thanks to Dr William Werbel, resp virus vax in immunocompromised population:
&lt;ul&gt;
&lt;li&gt;vax 2 weeks before or 3-6 months after immunocompromise expected&lt;/li&gt;
&lt;li&gt;expect blunted response if given within that window&lt;/li&gt;
&lt;li&gt;defer during acute rejection, illness&lt;/li&gt;
&lt;li&gt;adjust for viral circulation&lt;/li&gt;
&lt;/ul&gt;
#IDSky #IDWeek &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 1 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;




&lt;h3 id=&#34;platform&#34;&gt;Bluesky vs X?
  &lt;a href=&#34;#platform&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/idweek25/index_files/figure-html/unnamed-chunk-12-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Wow, the two platforms are actually quite comparable. Looking at engagement (e.g., likes, bookmarks, quotes), though, Bluesky seems to come out ahead. Let&amp;rsquo;s take a deeper dive into engagement.&lt;/p&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/idweek25/index_files/figure-html/unnamed-chunk-14-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;We used negative binomial regression to compare engagement metrics between platforms (Bluesky vs X) on each of five dates, applying the Benjamini-Hochberg (BH) correction for multiple testing with alpha 0.05. After FDR correction, significant platform differences were found for like_count (all 5 days), reply_count (3 days), and repost_count (2 days), all favoring more engagement on Bluesky than on X. Very interesting!&lt;/p&gt;
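&lt;p&gt;If the Benjamini-Hochberg step is new to you, here is a minimal Python sketch of how BH-adjusted p-values can be computed. It is an illustration only, not the code used for this post: the function name &lt;code&gt;bh_adjust&lt;/code&gt; and the raw p-values are made up.&lt;/p&gt;

```python
def bh_adjust(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    # Visit the p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, i in enumerate(order):
        rank = m - pos  # rank m for the largest p-value, 1 for the smallest
        # Scale by m / rank, then keep the adjusted values monotone.
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, for illustration only (not from the models here).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = bh_adjust(raw)

# A test counts as significant when its adjusted p-value is below alpha 0.05.
significant = [0.05 > p for p in adj]
print(adj)
print(significant)
```

&lt;p&gt;Note that a raw p-value just under 0.05 can stop being significant once it is adjusted, which is the whole point of correcting for multiple comparisons.&lt;/p&gt;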
&lt;table&gt;
 &lt;thead&gt;
  &lt;tr&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; date_factor &lt;/th&gt;
   &lt;th style=&#34;text-align:left;&#34;&gt; metric &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; coef &lt;/th&gt;
   &lt;th style=&#34;text-align:right;&#34;&gt; pval_adj &lt;/th&gt;
  &lt;/tr&gt;
 &lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; quote_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 19.2115426 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.9988609 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; like_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; -1.2536129 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 0.0111716 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; reply_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -2.2155737 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.1240385 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; repost_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -1.0116009 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.2703430 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-18 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; bookmark_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.6061358 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.9009762 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; quote_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -1.2872035 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.2045081 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; like_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; -1.7996101 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 0.0000000 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; reply_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; -1.4764455 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 0.0002229 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; repost_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; -1.6381804 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 0.0002229 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-19 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; bookmark_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.3709127 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.8569110 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; quote_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.9022669 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.1240385 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; like_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; -1.4247822 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 0.0000000 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; reply_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.2529024 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.2624342 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; repost_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.4804749 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.2229932 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-20 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; bookmark_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.6623755 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.2229932 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; quote_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -2.4923102 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0544660 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; like_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; -1.5109780 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 0.0000000 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; reply_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; -0.7904070 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 0.0028636 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; repost_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; -1.1225999 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 0.0050867 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-21 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; bookmark_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -1.8732710 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0579333 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; quote_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -1.7408394 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.0685566 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; like_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; -0.8979284 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 0.0006327 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; reply_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; -1.2430010 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;font-weight: bold;background-color: rgba(144, 238, 144, 80) !important;&#34;&gt; 0.0000656 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; repost_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.6104784 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.2275125 &lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; 2025-10-22 &lt;/td&gt;
   &lt;td style=&#34;text-align:left;&#34;&gt; bookmark_count &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; -0.2675336 &lt;/td&gt;
   &lt;td style=&#34;text-align:right;&#34;&gt; 0.9227407 &lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;All rows highlighted in green are significant after BH correction. In the coefficient column, every significant coefficient is negative, favoring Bluesky. How do we interpret this? Let&amp;rsquo;s take a closer look at &lt;code&gt;like_count&lt;/code&gt;, since the platforms differed on every date. On the official first day of IDWeek, 10-19-25, we have a &lt;code&gt;coefficient of -1.7996101&lt;/code&gt;, which means Bluesky posts had about &lt;code&gt;6 times&lt;/code&gt; as many &lt;code&gt;likes&lt;/code&gt; as X posts, using &lt;code&gt;1/exp(-1.7996101)&lt;/code&gt;.&lt;/p&gt;
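&lt;p&gt;The conversion above can be sketched in a few lines of Python. This assumes the coefficient is on the log scale and describes X relative to Bluesky, so a negative value means fewer counts on X; the function name is hypothetical.&lt;/p&gt;

```python
import math


def fold_higher_on_bluesky(coef):
    """Turn a log-scale coefficient (X relative to Bluesky) into a fold difference.

    exp(coef) is the rate ratio of X to Bluesky, so its inverse tells us
    how many times higher the expected count is on Bluesky.
    """
    rate_ratio_x_vs_bluesky = math.exp(coef)
    return 1 / rate_ratio_x_vs_bluesky


like_coef_oct19 = -1.7996101  # like_count on 2025-10-19, from the table above
print(round(fold_higher_on_bluesky(like_coef_oct19), 2))
```

&lt;p&gt;A coefficient of 0 would give a fold difference of 1, meaning no difference between the platforms.&lt;/p&gt;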
&lt;p&gt;Now what about the repeated/duplicate posts? Some users do post on both platforms. How do we account for that? Using Claude Code for the first time, with instructions, we detected duplicates between Bluesky and X by applying a multi-method text similarity algorithm (exact match, ≥45% Jaccard similarity, substring match, or ≥70% word overlap) to posts from users active on both platforms within a 3-hour time window, using normalized text with URLs, hashtags, and mentions removed. To validate accuracy, we performed 100 iterations of 10% stratified random sampling of cross-platform users, calculating precision, recall, F1 score, and overall accuracy by comparing algorithm-detected duplicates against ground truth determined by the same similarity criteria.&lt;/p&gt;
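&lt;p&gt;To make the matching rules concrete, here is a rough Python sketch of the four checks and the 3-hour window. It is not the code Claude Code produced: the function names, the punctuation stripping, the definition of word overlap (shared words divided by the smaller word set), and the example posts are all assumptions for illustration.&lt;/p&gt;

```python
import re
from datetime import datetime, timedelta


def normalize(text):
    """Lowercase and drop URLs, hashtags, mentions, and punctuation."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    text = re.sub(r"[#@]\S+", " ", text)
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # assumption: punctuation is ignored
    return " ".join(text.split())


def is_duplicate(post_a, post_b, window_hours=3):
    """Compare two (text, timestamp) pairs using the four text rules."""
    (text_a, time_a), (text_b, time_b) = post_a, post_b
    if abs(time_a - time_b) > timedelta(hours=window_hours):
        return False
    a, b = normalize(text_a), normalize(text_b)
    if not a or not b:
        return False
    words_a, words_b = set(a.split()), set(b.split())
    shared = len(words_a.intersection(words_b))
    jaccard = shared / len(words_a.union(words_b))
    overlap = shared / min(len(words_a), len(words_b))  # assumed definition
    return a == b or a in b or b in a or jaccard >= 0.45 or overlap >= 0.70


# Made-up posts, not taken from the dataset.
bsky_post = ("Great session on stewardship today #IDWeek2025 @idweek.bsky.social",
             datetime(2025, 10, 20, 14, 0))
x_post = ("Great session on stewardship today! #IDWeek2025 https://example.com/x",
          datetime(2025, 10, 20, 15, 30))
late_post = (x_post[0], datetime(2025, 10, 20, 19, 30))
other_post = ("Rare mold infections talk was packed", datetime(2025, 10, 20, 14, 5))

print(is_duplicate(bsky_post, x_post))
print(is_duplicate(bsky_post, late_post))
print(is_duplicate(bsky_post, other_post))
```

&lt;p&gt;In this sketch only posts from users active on both platforms would be passed in, and any single rule is enough to flag a pair, which keeps it sensitive to light edits such as an added link or hashtag.&lt;/p&gt;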
&lt;p&gt;Our results showed ~1234 unique posts out of 1272 total posts. Let&amp;rsquo;s remove these duplicates and see how the total posts from both platforms look.&lt;/p&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/idweek25/index_files/figure-html/unnamed-chunk-16-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Alright, it looks quite similar to the previous plot! So the duplicates didn&amp;rsquo;t really make a whole lot of difference!&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s see whether engagement differs when we compare the SAME post on both platforms.&lt;/p&gt;
&lt;img src=&#34;https://www.kenkoonwong.com/blog/idweek25/index_files/figure-html/unnamed-chunk-17-1.png&#34; width=&#34;672&#34; /&gt;
&lt;p&gt;Wow, this is interesting. With only 38 posts appearing on both platforms, we observed more likes on Bluesky than on X for the first 3 days (10-18-25, 10-19-25, 10-20-25), peaking at &lt;code&gt;11x higher like counts&lt;/code&gt; on Bluesky than on X, &lt;code&gt;from 1/exp(-2.3978953)&lt;/code&gt;. Diving into the raw data, it looks like posts from &lt;code&gt;dralicehan.bsky.social&lt;/code&gt; and &lt;code&gt;pascalisid.bsky.social&lt;/code&gt; were the main contributors to the high engagement on Bluesky!&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;blueskypost.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;br&gt;




&lt;h2 id=&#34;ack&#34;&gt;Acknowledgement
  &lt;a href=&#34;#ack&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Thank you to those who have contributed! I learnt a lot from you guys! It&amp;rsquo;s this type of engagement that keeps us all connected, even those who can&amp;rsquo;t attend the meeting! If you see your username above, thank you! If you don&amp;rsquo;t see it and have contributed, thank you! Sorry if I have left any usernames out. Please feel free to use the shiny app link above to look at all the posts queried to catch up, or just search on the Bluesky and X platforms!&lt;/p&gt;
&lt;p&gt;I especially want to thank 
&lt;a href=&#34;https://orcid.org/0000-0003-4566-1905&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Jonathan Ryder&lt;/a&gt; and 
&lt;a href=&#34;https://orcid.org/0000-0002-1789-603X&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Joseph Marcus&lt;/a&gt; for their feedback on the tweet/post analysis! They&amp;rsquo;ve been really helpful in giving me feedback I could incorporate into this blog. Thank you!&lt;/p&gt;




&lt;h2 id=&#34;lessons-learnt&#34;&gt;Lessons Learnt
  &lt;a href=&#34;#lessons-learnt&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Did not know that |&amp;gt; does not pass &lt;code&gt;.&lt;/code&gt; whereas %&amp;gt;% does 
&lt;a href=&#34;#top50&#34;&gt;see code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Learnt about delayed RPR rise with DoxyPEP&lt;/li&gt;
&lt;li&gt;Watch for GOAT trial after July 2027&lt;/li&gt;
&lt;li&gt;Watch out for Welder&amp;rsquo;s anthrax&lt;/li&gt;
&lt;li&gt;Association of shingles vaccine &amp;amp; lower ASCVD&lt;/li&gt;
&lt;li&gt;Negative binomial &amp;amp; BH correction&lt;/li&gt;
&lt;li&gt;and many more&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you like this article:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;please feel free to send me a 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;comment or visit my other blogs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;please feel free to follow me on 
&lt;a href=&#34;https://bsky.app/profile/kenkoonwong.bsky.social&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;BlueSky&lt;/a&gt;, 
&lt;a href=&#34;https://twitter.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;twitter&lt;/a&gt;, 
&lt;a href=&#34;https://github.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub&lt;/a&gt; or 
&lt;a href=&#34;https://med-mastodon.com/@kenkoonwong&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Mastodon&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;if you would like to collaborate please feel free to 
&lt;a href=&#34;https://www.kenkoonwong.com/contact/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;contact me&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>My Attempt To Reproduce Stanford HIVdb Sequence and Mutation Analysis From Scratch</title>
      <link>https://www.kenkoonwong.com/blog/hiv-genotype/</link>
      <pubDate>Fri, 10 Oct 2025 00:00:00 +0000</pubDate>
      
      <guid>https://www.kenkoonwong.com/blog/hiv-genotype/</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Ever wondered what M184V and K65R actually mean? I rebuilt Stanford&amp;rsquo;s HIV resistance algorithm from scratch to find out. Spoiler: it took tons of code to match their 3-line tool, but the lesson was worth it. Agreement went from 89% to 99.8% after trying different alignment methods&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src=&#34;hiv.jpg&#34; alt=&#34;&#34;&gt;&lt;/p&gt;




&lt;h2 id=&#34;motivations&#34;&gt;Motivations:
  &lt;a href=&#34;#motivations&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;We&amp;rsquo;ve all learnt and memorized what those letters and numbers mean when it comes to antiretroviral resistance. Since we&amp;rsquo;ve been exploring genomics lately, let&amp;rsquo;s take another look at HIV genotyping. 
&lt;a href=&#34;https://hivdb.stanford.edu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Stanford University HIVdb&lt;/a&gt; is an amazing resource! I&amp;rsquo;ve always found genotype resistance reports confusing and intimidating, to the point of not knowing how to even check for resistance. Let&amp;rsquo;s put on our Bioconductor hat, revisit this topic, and see if we can at least get a better understanding of what these letters and numbers mean. Better yet, let&amp;rsquo;s use this opportunity to try to reproduce the algorithm that tells us the susceptibility to each ART drug! Hang tight on this one, it&amp;rsquo;s going to be a bumpy road!&lt;/p&gt;




&lt;h2 id=&#34;objectives&#34;&gt;Objectives
  &lt;a href=&#34;#objectives&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#ref&#34;&gt;Find HIV reference gene&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#resistance&#34;&gt;Find Resistance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#workflow&#34;&gt;Workflow&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;#segment&#34;&gt;Identify&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#translate&#34;&gt;Translate&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#align&#34;&gt;Align&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#stanford&#34;&gt;Stanford&amp;rsquo;s Sierrapy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#trial1&#34;&gt;Trial 1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#trial2&#34;&gt;Trial 2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#trial3&#34;&gt;Trial 3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#trial4&#34;&gt;Trial 4&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#trial5&#34;&gt;Trial 5&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#all&#34;&gt;Assess all&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#nucamino&#34;&gt;nucamino pipeline&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#thoughts&#34;&gt;Final Thoughts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#opportunity&#34;&gt;Opportunities for improvement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;#lessons&#34;&gt;Lessons Learnt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2 id=&#34;ref&#34;&gt;Find HIV Reference Gene
  &lt;a href=&#34;#ref&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;From most of my searching, the reference genome appears to be 
&lt;a href=&#34;https://www.ncbi.nlm.nih.gov/nuccore/K03455.1/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;HXB2&lt;/a&gt;. We know that the pol gene encodes protease (PR), reverse transcriptase (RT) and integrase (IN). The figure below comes directly from NCBI (
&lt;a href=&#34;https://www.ncbi.nlm.nih.gov/gene/155348&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;source&lt;/a&gt;). You can click through and hover over those regions to see what they are.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;pol.png&#34; alt=&#34;&#34;&gt;
Take note that each of these genes sits at a specific position. So the trick is to use a reference, in our case HXB2, locate the three regions (PR, RT, IN), then extract them and build a database from them. Some references annotate these positions clearly, some don&amp;rsquo;t. Ours didn&amp;rsquo;t really say either, so we had to look around to find the coordinates.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(Biostrings)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(DECIPHER)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;library&lt;/span&gt;(tidyverse)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## We downloaded a bunch of HIV genomes, including the reference known as K03455 (aka HXB2)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;(all_hiv_genome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;readDNAStringSet&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;all_hiv_whole_genome.fasta&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;all_fasta.png&#34; alt=&#34;&#34;&gt;
Wow, look at that. The HIV genome is only around 9,000 bp, whereas a bacterial genome such as &lt;em&gt;E. coli&lt;/em&gt; is about 4.6 Mb.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## load reference&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;hxb2_idx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;grep&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;K03455&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;(all_hiv_genome), ignore.case &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;hxb2_genome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; all_hiv_genome[hxb2_idx]  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## locate the genes&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;rt_sequence &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;subseq&lt;/span&gt;(hxb2_genome, start &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2550&lt;/span&gt;, end &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4229&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;pi_sequence &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;subseq&lt;/span&gt;(hxb2_genome, start &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2253&lt;/span&gt;, end &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2549&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;int_sequence &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;subseq&lt;/span&gt;(hxb2_genome, start &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4230&lt;/span&gt;, end &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5093&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Rename sequences with informative names&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;(pi_sequence) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_PR&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;(rt_sequence) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_RT&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;(int_sequence) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_INT&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Combine all three into one DNAStringSet&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;pol_regions &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(pi_sequence, rt_sequence, int_sequence)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Write to a single FASTA file&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;writeXStringSet&lt;/span&gt;(pol_regions, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_pol_regions.fasta&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# make it into blast database&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;makeblastdb -in hiv_pol_regions.fasta -dbtype nucl -out /path/to/hiv/hiv_pol_db -parse_seqids&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;With the above we&amp;rsquo;re basically setting up a database of the reference PR, RT, and IN regions of pol. We will then take the genomes of interest and use BLAST to identify where these regions are located on each sample sequence.&lt;/p&gt;
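&lt;p&gt;Before running BLAST, a quick sanity check on the HXB2 coordinates used above can be sketched as follows. The &lt;code&gt;regions&lt;/code&gt; data frame is just a convenience for this check, not part of the pipeline. Each region should be a whole number of codons, and the lengths should correspond to the familiar protein sizes (PR 99, RT 560, IN 288 amino acids).&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# HXB2 coordinates copied from the subseq() calls above
regions &amp;lt;- data.frame(
  gene  = c(&amp;#34;PR&amp;#34;, &amp;#34;RT&amp;#34;, &amp;#34;IN&amp;#34;),
  start = c(2253, 2550, 4230),
  end   = c(2549, 4229, 5093)
)

# nucleotide length, whether it is a multiple of 3, and codon count
regions$nt_length &amp;lt;- regions$end - regions$start + 1
regions$in_frame  &amp;lt;- regions$nt_length %% 3 == 0
regions$aa_length &amp;lt;- regions$nt_length / 3

regions
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;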
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## use blast to find where these genes positions are&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#39;blastn -query all_hiv_whole_genome.fasta -db /path/to/hiv/hiv_pol_db -word_size 7 -evalue 0.0000001 -outfmt 6 -out hiv_sample_blast_results.txt&amp;#39;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## read it&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;colnames &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;qseqid&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;sseqid&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;pident&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;length&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;mismatch&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;gapopen&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;qstart&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;qend&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;sstart&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;send&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;evalue&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;bitscore&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;(all_hiv_sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_tsv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_sample_blast_results.txt&amp;#34;&lt;/span&gt;, col_names &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; colnames))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;blast_result.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Notice that the &lt;code&gt;sseqid&lt;/code&gt; column contains &lt;code&gt;HIV_RT&lt;/code&gt;, &lt;code&gt;HIV_INT&lt;/code&gt;, and &lt;code&gt;HIV_PR&lt;/code&gt;. We just need to filter for the gene we want and extract it from the sample using its &lt;code&gt;qstart&lt;/code&gt; and &lt;code&gt;qend&lt;/code&gt; positions. Let&amp;rsquo;s take a look at &lt;code&gt;U63632.1&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; all_hiv_sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(qseqid, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;U63632.1&amp;#34;&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(sseqid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_RT&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(qseqid, qstart, qend)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;OK, how do we go from here to knowing what&amp;rsquo;s mutated?&lt;/p&gt;




&lt;h2 id=&#34;resistance&#34;&gt;Find The Resistance
  &lt;a href=&#34;#resistance&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;All those labels, &lt;code&gt;M184V&lt;/code&gt;, &lt;code&gt;K65R&lt;/code&gt;, etc.: the letters and numbers must mean something, right? You&amp;rsquo;re right! The number is the amino acid position in the protein, but what about the letters? They don&amp;rsquo;t really look like ATCG, do they? They&amp;rsquo;re amino acids! For example, the &lt;code&gt;M&lt;/code&gt; in &lt;code&gt;M184V&lt;/code&gt; stands for Methionine (the reference amino acid) and the &lt;code&gt;V&lt;/code&gt; stands for Valine (the mutant). So &lt;code&gt;M184V&lt;/code&gt; means that at position 184, Methionine has mutated to Valine. Similarly, &lt;code&gt;K65R&lt;/code&gt; means that at position 65, Lysine has mutated to Arginine.&lt;/p&gt;
&lt;p&gt;Amino acids and their letters:  &lt;br&gt;
&lt;code&gt;A&lt;/code&gt;: Alanine, &lt;code&gt;C&lt;/code&gt;: Cysteine, &lt;code&gt;D&lt;/code&gt;: Aspartic Acid, &lt;code&gt;E&lt;/code&gt;: Glutamic Acid, &lt;code&gt;F&lt;/code&gt;: Phenylalanine, &lt;code&gt;G&lt;/code&gt;: Glycine, &lt;code&gt;H&lt;/code&gt;: Histidine, &lt;code&gt;I&lt;/code&gt;: Isoleucine, &lt;code&gt;K&lt;/code&gt;: Lysine, &lt;code&gt;L&lt;/code&gt;: Leucine, &lt;code&gt;M&lt;/code&gt;: Methionine, &lt;code&gt;N&lt;/code&gt;: Asparagine, &lt;code&gt;P&lt;/code&gt;: Proline, &lt;code&gt;Q&lt;/code&gt;: Glutamine, &lt;code&gt;R&lt;/code&gt;: Arginine, &lt;code&gt;S&lt;/code&gt;: Serine, &lt;code&gt;T&lt;/code&gt;: Threonine, &lt;code&gt;V&lt;/code&gt;: Valine, &lt;code&gt;W&lt;/code&gt;: Tryptophan, &lt;code&gt;Y&lt;/code&gt;: Tyrosine&lt;/p&gt;
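&lt;p&gt;The notation is regular enough to split programmatically. Here is a minimal base R sketch; &lt;code&gt;parse_mutation&lt;/code&gt; is a hypothetical helper for illustration, not something from Stanford&amp;rsquo;s tooling, and it only handles simple single-substitution labels.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper: split labels such as &amp;#34;M184V&amp;#34; into
# reference amino acid, position, and mutant amino acid
parse_mutation &amp;lt;- function(x) {
  # each list element is c(full match, ref, position, mutant)
  m &amp;lt;- regmatches(x, regexec(&amp;#34;^([A-Z])([0-9]+)([A-Z])$&amp;#34;, x))
  data.frame(
    mutation = x,
    ref_aa   = vapply(m, `[`, character(1), 2),
    position = as.integer(vapply(m, `[`, character(1), 3)),
    mut_aa   = vapply(m, `[`, character(1), 4)
  )
}

parsed &amp;lt;- parse_mutation(c(&amp;#34;M184V&amp;#34;, &amp;#34;K65R&amp;#34;))
parsed
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;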
&lt;p&gt;This also means that in our translated RT reference (HXB2), we should expect the amino acid at position 184 to be &lt;code&gt;M&lt;/code&gt;. Let&amp;rsquo;s take a look.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;(rt_t &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;translate&lt;/span&gt;(rt_sequence))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;aa.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Wow! We&amp;rsquo;re used to seeing DNA sequence colors but now take a look at the different amino acid colors! How pretty! Now let&amp;rsquo;s look at position 184.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;subseq&lt;/span&gt;(rt_t, &lt;span style=&#34;color:#099&#34;&gt;184&lt;/span&gt;,&lt;span style=&#34;color:#099&#34;&gt;184&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;rt_t_184.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Alright! &lt;code&gt;M&lt;/code&gt; indeed! Now, let&amp;rsquo;s go through the workflow for applying this information to our sample.&lt;/p&gt;




&lt;h2 id=&#34;workflow&#34;&gt;Workflow
  &lt;a href=&#34;#workflow&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;TL;DR Extract region -&amp;gt; Align Translation of AA with Reference &amp;amp; Assess Mutation -&amp;gt; Calculate Mutation Score&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3 id=&#34;segment&#34;&gt;Identify/Extract Regions
  &lt;a href=&#34;#segment&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; all_hiv_sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(qseqid, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;U63632.1&amp;#34;&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(sseqid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_RT&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(qseqid, qstart, qend)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample_rt &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;subseq&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  all_hiv_genome&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;[str_detect&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;(all_hiv_genome), sample1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;qseqid)],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  start &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;qstart,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  end &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;qend
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We&amp;rsquo;ve done this process above, but for other genes, we&amp;rsquo;d need to write a function to change &lt;code&gt;HIV_RT&lt;/code&gt; to the others.&lt;/p&gt;
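&lt;p&gt;One way such a function could look is sketched below. &lt;code&gt;extract_region&lt;/code&gt; and its arguments are hypothetical names. It assumes the BLAST table has the column names used above, that the sample ID matches exactly one genome name, and that keeping only the highest-bitscore hit per gene is acceptable.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(Biostrings)

# Hypothetical helper: pull one pol region out of one sample genome
# using BLAST tabular hits (qseqid, sseqid, qstart, qend, bitscore)
extract_region &amp;lt;- function(genomes, hits, sample_id, gene) {
  # fixed = TRUE so the &amp;#34;.&amp;#34; in an accession is not treated as a regex wildcard
  hit &amp;lt;- hits[hits$sseqid == gene &amp;amp; grepl(sample_id, hits$qseqid, fixed = TRUE), ]
  if (nrow(hit) == 0) return(NULL)

  # assumption: when there are several hits, keep the highest-scoring one
  hit &amp;lt;- hit[which.max(hit$bitscore), ]

  # assumption: sample_id matches exactly one genome name
  genome &amp;lt;- genomes[grepl(sample_id, names(genomes), fixed = TRUE)]
  out &amp;lt;- subseq(genome, start = hit$qstart, end = hit$qend)
  names(out) &amp;lt;- paste(sample_id, gene, sep = &amp;#34;_&amp;#34;)
  out
}

# toy data, only to show the call
toy_genomes &amp;lt;- DNAStringSet(c(s1 = &amp;#34;ACGTACGTACGT&amp;#34;))
toy_hits &amp;lt;- data.frame(qseqid = &amp;#34;s1&amp;#34;, sseqid = &amp;#34;HIV_RT&amp;#34;,
                       qstart = 2, qend = 7, bitscore = 50)
toy_rt &amp;lt;- extract_region(toy_genomes, toy_hits, &amp;#34;s1&amp;#34;, &amp;#34;HIV_RT&amp;#34;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With the real objects, the same call would take &lt;code&gt;all_hiv_genome&lt;/code&gt; and &lt;code&gt;all_hiv_sample&lt;/code&gt;, and could be looped over &lt;code&gt;HIV_PR&lt;/code&gt;, &lt;code&gt;HIV_RT&lt;/code&gt; and &lt;code&gt;HIV_INT&lt;/code&gt;.&lt;/p&gt;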




&lt;h3 id=&#34;translate&#34;&gt;Translate
  &lt;a href=&#34;#translate&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Just for demonstration purposes:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#### Take note, this entire code chunk is not needed !!! &lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;seq_length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; sample1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;qend &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; sample1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;qstart &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;adjusted_length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; seq_length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; (seq_length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%%&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample_rt &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;subseq&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  all_hiv_genome&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;[str_detect&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;(all_hiv_genome), sample1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;qseqid)],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  start &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;qstart,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  end &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;qstart &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;+&lt;/span&gt; adjusted_length &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;(sample_rt_t &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;translate&lt;/span&gt;(sample_rt, if.fuzzy.codon &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;solve&amp;#34;&lt;/span&gt;))  
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;translate_sample.png&#34; alt=&#34;&#34;&gt;
Sometimes we need to adjust the length because the extracted sequence may not be perfectly divisible by 3. Codons are groups of 3 nucleotides, so any trailing nucleotides that don&amp;rsquo;t form a complete codon cannot be translated; trimming the sequence to a multiple of 3 avoids that. During translation we can also set &lt;code&gt;if.fuzzy.codon = &amp;#34;solve&amp;#34;&lt;/code&gt; to handle fuzzy codons, i.e. codons containing ambiguity codes that arise from sequencing uncertainty or mixed bases. A fuzzy codon that still maps to a single amino acid is translated, and a truly ambiguous one becomes &lt;code&gt;X&lt;/code&gt;, so translation can proceed even when the input has some uncertainty.&lt;/p&gt;
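&lt;p&gt;A small illustration of both points, using a made-up 10-nucleotide fragment rather than real sample data: the trailing base is trimmed, and the two ambiguous codons behave differently under &lt;code&gt;solve&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(Biostrings)

# made-up fragment: 10 nt, so one base is left over after 3 codons
frag &amp;lt;- DNAString(&amp;#34;ATGGCNRAAT&amp;#34;)

# trim to a whole number of codons (10 -&amp;gt; 9 nt)
n_keep &amp;lt;- length(frag) - (length(frag) %% 3)
frag_trimmed &amp;lt;- subseq(frag, start = 1, end = n_keep)

# GCN encodes Ala whatever N is, so it can be resolved;
# RAA could be AAA (Lys) or GAA (Glu), so it is reported as X
aa &amp;lt;- translate(frag_trimmed, if.fuzzy.codon = &amp;#34;solve&amp;#34;)
as.character(aa)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;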
&lt;p&gt;But in reality, we can skip this step: DECIPHER&amp;rsquo;s &lt;code&gt;AlignTranslation&lt;/code&gt; takes the DNA sequences directly and aligns them by their amino acid translation!&lt;/p&gt;




&lt;h3 id=&#34;align&#34;&gt;Align Translation &amp;amp; Assess Mutation
  &lt;a href=&#34;#align&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# AlignTranslation translates the DNA, aligns the AA, and returns the aligned AA, sweet!!!&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;alignseq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;AlignTranslation&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(rt_sequence,sample_rt), type&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;AAStringSet&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# turn alignment into matrix&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;align_matrix &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.matrix&lt;/span&gt;(alignseq)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# extract alignment on both ref and sample&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ref_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; align_matrix[1,]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; align_matrix[2,]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# find position where there is mutation&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mutation_positions &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;which&lt;/span&gt;(ref_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!=&lt;/span&gt; sample_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; ref_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;-&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; sample_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;-&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# load into dataframe&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;data.frame&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  position &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; mutation_positions,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  reference &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; ref_seq[mutation_positions],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample_seq[mutation_positions],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mutation &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(ref_seq[mutation_positions], mutation_positions, sample_seq[mutation_positions])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(position_replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(position,sample))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mutations&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;mutation
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;mutation.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
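&lt;p&gt;To see what the comparison is doing on a smaller scale, here is a toy sketch with two made-up aligned peptides (not real HIV data). The names &lt;code&gt;ref_toy&lt;/code&gt;, &lt;code&gt;sample_toy&lt;/code&gt; and &lt;code&gt;pos_toy&lt;/code&gt; are hypothetical. It uses the same rule as above: keep positions where the residues differ and neither side is a gap.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# toy aligned peptides (made up for illustration); &#34;-&#34; marks an alignment gap
ref_toy    &lt;- c(&#34;M&#34;, &#34;K&#34;, &#34;-&#34;, &#34;L&#34;, &#34;T&#34;)
sample_toy &lt;- c(&#34;M&#34;, &#34;R&#34;, &#34;A&#34;, &#34;L&#34;, &#34;-&#34;)

# positions that differ and are not a gap on either side
pos_toy &lt;- which(ref_toy != sample_toy &amp; ref_toy != &#34;-&#34; &amp; sample_toy != &#34;-&#34;)

# label as reference residue + alignment position + sample residue
paste0(ref_toy[pos_toy], pos_toy, sample_toy[pos_toy])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Only position 2 qualifies here, giving &lt;code&gt;K2R&lt;/code&gt;; positions 3 and 5 are skipped because one side is a gap. One caveat worth keeping in mind: the number in the label is the alignment column, so it matches the reference numbering only when the alignment puts no gaps into the reference before that position.&lt;/p&gt;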
&lt;p&gt;Lots of mutations. HIV is notorious for SNPs, which may or may not be clinically significant. And as you can see, a single nucleotide mutation can make the translated amino acid differ from the reference. OK, now that we know how to check for mutations, how do we know if this matches Stanford HIVdb? Let&amp;rsquo;s copy the entire genome of this sample and paste it 
&lt;a href=&#34;https://hivdb.stanford.edu/hivdb/by-sequences/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;here&lt;/a&gt;&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## easy way of copying to the system clipboard so we can paste on the Stanford website (link above)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;all_hiv_genome&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;[str_detect&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;(all_hiv_genome),&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;U63632.1&amp;#34;&lt;/span&gt;)] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.character&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; clipr&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;write_clip&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;stanford.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Not too shabby! The one mutation that is significant is &lt;code&gt;M348I&lt;/code&gt;, and we were able to capture it. Now, how do we go from here to the inferred susceptibility of NRTIs and NNRTIs? In comes the mutation score algorithm!&lt;/p&gt;




&lt;h3 id=&#34;calculate-mutation-score&#34;&gt;Calculate Mutation Score
  &lt;a href=&#34;#calculate-mutation-score&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;This part is very interesting. My initial trial used their 
&lt;a href=&#34;https://hivdb.stanford.edu/download/GenoPhenoDatasets/DRMcv.R&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;genotype-phenotype DRMcv model&lt;/a&gt;, but I couldn&amp;rsquo;t reproduce most of the results. I then realized that 
&lt;a href=&#34;https://hivdb.stanford.edu/hivdb/by-sequences/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Stanford HIVdb Sequence Analysis&lt;/a&gt; uses 
&lt;a href=&#34;https://hivdb.stanford.edu/page/release-notes/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;another algorithm&lt;/a&gt;. This algorithm, at least in its final form, is more straightforward than fitting a model to reproduce a prediction, which is what DRMcv does with &lt;code&gt;glmnet&lt;/code&gt; and &lt;code&gt;lasso&lt;/code&gt;. There is also 
&lt;a href=&#34;https://cms.hivdb.org/prod/downloads/release-notes/genotypic-resistance-test-interpretation-system-oct2019.pdf&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;extensive documentation&lt;/a&gt;. All of the &lt;code&gt;csv&lt;/code&gt; files below were obtained from the links provided in the documentation. Let&amp;rsquo;s code!&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;load_hivdb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(dataset, mutations){
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(dataset&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;NRTI&amp;#34;&lt;/span&gt;) {  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_nrti_single.csv&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_nrti_combo.csv&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(dataset&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;NNRTI&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_nnrti_single.csv&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_nnrti_combo.csv&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(dataset&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;PI&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_pi_single.csv&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_pi_combo.csv&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(dataset&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;INSTI&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_insti_single.csv&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_insti_combo.csv&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_interest &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(mut_i in mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(Rule)) { 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# TRUE when this single-mutation rule is present in the sample&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    add_mut &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt;  mut_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; (mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(mutation))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(add_mut) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      mut_interest &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(mut_interest, mut_i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mut_interest)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;           &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Get 0 rows but keep structure&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;           &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;           &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summarise&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;across&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pivot_longer&lt;/span&gt;(cols &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), names_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ART&amp;#34;&lt;/span&gt;, values_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;score&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;9&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;14&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;15&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;29&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;30&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;59&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;60&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(interpretation &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;susceptible&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;potential low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;intermediate resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;high-level resistance&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            )))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_combo_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(combination_rule) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; + &amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_split&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; \\+ &amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unlist&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unique&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_combo_idx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_combo_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; (mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(mutation))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Function to sort mutations by position number&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sort_mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(mutation_string) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Extract the numeric position from each mutation label (e.g. 348 from M348I)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  positions &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_extract&lt;/span&gt;(mutation_string, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;\\d+&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Sort mutations by their positions&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  sorted_mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mutation_string&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;[order&lt;/span&gt;(positions)]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(sorted_mutations)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_combo_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sort_mutations&lt;/span&gt;(mut_combo_vec[mut_combo_idx])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Function to create combinations of a given size&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;create_combinations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(mutations_, size) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;combn&lt;/span&gt;(mutations_, size, FUN &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(x) &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(x, collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; + &amp;#34;&lt;/span&gt;), simplify &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mut_combo_seq)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;) { 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score_sum &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(Rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; mut_interest) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summarize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;across&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), sum)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pivot_longer&lt;/span&gt;(cols &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), names_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ART&amp;#34;&lt;/span&gt;, values_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;score&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;9&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;14&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;15&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;29&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;30&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;59&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;60&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(interpretation &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;susceptible&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;potential low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;intermediate resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;high-level resistance&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  } else {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Generate combinations of size 2, 3, and 4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;all_combinations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;map&lt;/span&gt;(n&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;m, &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;create_combinations&lt;/span&gt;(mut_combo_seq, .x)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unlist&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# filter from mut_score_combo&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_score_combo_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(combination_rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; all_combinations) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rename&lt;/span&gt;(Rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; combination_rule)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# sum single + combo&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_score_sum &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(Rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; mut_interest) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bind_rows&lt;/span&gt;(mut_score_combo_df) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summarize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;across&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), sum)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pivot_longer&lt;/span&gt;(cols &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), names_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ART&amp;#34;&lt;/span&gt;, values_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;score&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;9&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;14&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;15&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;29&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;30&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;59&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;60&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(interpretation &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;susceptible&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;potential low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;intermediate resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;high-level resistance&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(mut_score_sum)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;(nrti &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;load_hivdb&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;NRTI&amp;#34;&lt;/span&gt;, mutations))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;(nnrti &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;load_hivdb&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;NNRTI&amp;#34;&lt;/span&gt;, mutations))
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;nrti_mut.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;nnrti_mut.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;Wow, long code, but we were able to reproduce the same scores as Stanford HIVdb! Not too shabby. Take note that when certain mutations occur together, a separate table adds a combination penalty score on top of the individual scores. For example, if our sequence contains the &lt;code&gt;M41L + T215FY&lt;/code&gt; mutations, the total mutation score for TDF is &lt;code&gt;5+10+10=25&lt;/code&gt;, not just &lt;code&gt;5+10&lt;/code&gt;. 
&lt;a href=&#34;https://hivdb.stanford.edu/dr-summary/mut-scores/NRTI/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;See here for the full table&lt;/a&gt;.&lt;/p&gt;
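&lt;p&gt;To make that arithmetic concrete, here is a minimal sketch of the single + combination scoring idea. The toy vectors &lt;code&gt;single_scores&lt;/code&gt; and &lt;code&gt;combo_scores&lt;/code&gt; are hypothetical stand-ins for the real HIVdb tables, and the sketch assumes the three terms in &lt;code&gt;5+10+10&lt;/code&gt; are M41L, T215FY and their combination, in that order.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Toy TDF scores based on the worked example (assumed order; not the full HIVdb table)
single_scores &amp;lt;- c(&amp;#34;M41L&amp;#34; = 5, &amp;#34;T215FY&amp;#34; = 10)
combo_scores &amp;lt;- c(&amp;#34;M41L + T215FY&amp;#34; = 10)

# Mutations observed in the sample, already sorted by position
observed &amp;lt;- c(&amp;#34;M41L&amp;#34;, &amp;#34;T215FY&amp;#34;)

# Build every pair of observed mutations, joined the same way as the combination rules
pairs &amp;lt;- combn(observed, 2, FUN = function(x) paste(x, collapse = &amp;#34; + &amp;#34;))

# Total = individual penalties + penalties for any combination rule that matches
# (pairs without a rule index to NA, which na.rm drops)
total &amp;lt;- sum(single_scores[observed]) + sum(combo_scores[pairs], na.rm = TRUE)
total # 25, i.e. 5 + 10 + 10
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Sorting the mutations by position before pasting matters here, since the lookup is a plain string match against the rule name.&lt;/p&gt;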
&lt;p&gt;Now, we just need to repeat this for PR and IN, wrap it all in a pipeline, and there you have it! Let&amp;rsquo;s look at their official web service via &lt;code&gt;sierrapy&lt;/code&gt; and then see if we can reproduce its results with our algorithm!&lt;/p&gt;
&lt;br&gt;




&lt;h3 id=&#34;stanford&#34;&gt;Stanford Sierrapy
  &lt;a href=&#34;#stanford&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Let&amp;rsquo;s explore a much simpler, expert-driven approach: Stanford HIVdb&amp;rsquo;s SierraPy, their Python client. It uses their web service, so you will need an internet connection and your sequence gets sent to their server. It&amp;rsquo;s very fast.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# install&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;pip install sierrapy
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# write fasta of sample of interest, let&amp;#39;s take a look at KJ849778.1&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;all_hiv_genome[all_hiv_genome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;KJ849778.1&amp;#34;&lt;/span&gt;)] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;writeXStringSet&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_test.fasta&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# run sierrapy input fasta&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;sierrapy fasta hiv_test.fasta -o hiv_test_output.json&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# read json output&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;jsonlite&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_json&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_test_output.0.json&amp;#34;&lt;/span&gt;, simplifyVector &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;drugResistance[[1]][3]&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;drugScores
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;sierrapy.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Alright. 🤔 With 3 lines of code, once you have an assembled HIV whole genome, the Sierra web service via 
&lt;a href=&#34;https://github.com/hivdb/sierra-client/tree/master/python&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;&lt;code&gt;Sierrapy&lt;/code&gt;&lt;/a&gt; will give you the susceptibility interpretation within seconds! The JSON also includes the alignments! You do need internet for this though. If you&amp;rsquo;re looking for local interpretation, you can look into 
&lt;a href=&#34;https://github.com/PoonLab/sierra-local&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;&lt;code&gt;Sierra Local&lt;/code&gt;&lt;/a&gt;. There are lots of ways to get here without writing heavy code like we did! lol, but it&amp;rsquo;s great to know how it works under the hood!&lt;/p&gt;
&lt;p&gt;Now, since we know what Stanford HIVdb shows, let&amp;rsquo;s see if our algorithm will return the same result!&lt;/p&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;extract_align &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(sample,class) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;exists&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hxb2_genome&amp;#34;&lt;/span&gt;)) { &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;stop&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;you need to load/enter hxb2_genome&amp;#34;&lt;/span&gt;) } 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample1 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; all_hiv_sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(qseqid, sample)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(sseqid &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; class) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(qseqid, qstart, qend)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample_rt &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;subseq&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  all_hiv_genome&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;[str_detect&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;(all_hiv_genome), sample1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;qseqid)],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  start &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;qstart,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  end &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample1&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;qend
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;## locate the genes&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(class &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_RT&amp;#34;&lt;/span&gt;) { rt_sequence &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;subseq&lt;/span&gt;(hxb2_genome, start &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2550&lt;/span&gt;, end &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4229&lt;/span&gt;) }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(class &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_PR&amp;#34;&lt;/span&gt;) { rt_sequence &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;subseq&lt;/span&gt;(hxb2_genome, start &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2253&lt;/span&gt;, end &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2549&lt;/span&gt;) }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(class &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_INT&amp;#34;&lt;/span&gt;) { rt_sequence &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;subseq&lt;/span&gt;(hxb2_genome, start &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4230&lt;/span&gt;, end &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5093&lt;/span&gt;) }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# this automatically translates the aligned sequences into aligned amino acids, sweet!&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# alignseq &amp;lt;- AlignTranslation(c(rt_sequence,sample_rt), type=&amp;#34;AAStringSet&amp;#34;, verbose = F)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;alignseq_nt &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;AlignSeqs&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(rt_sequence,sample_rt),verbose&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;align_ref_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.matrix&lt;/span&gt;(alignseq_nt) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; _[1,] 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# alignment columns where the reference has a base (not a gap)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ref_pos &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;which&lt;/span&gt;(align_ref_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;LETTERS&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;align_ref_seq2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; align_ref_seq[ref_pos] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(collapse&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;DNAStringSet&lt;/span&gt;() 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;align_sample_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.matrix&lt;/span&gt;(alignseq_nt) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; _[2, ref_pos] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_replace&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;-&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;G&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(collapse&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;DNAStringSet&lt;/span&gt;() 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;alignseq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;AlignTranslation&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(align_ref_seq2,align_sample_seq), verbose&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#999&#34;&gt;F&lt;/span&gt;, type &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;AAStringSet&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# which(align_ref_seq[ref_pos] != align_sample_seq)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# turn alignment into matrix&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;align_matrix &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.matrix&lt;/span&gt;(alignseq)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# find where the reference starts, then extract ref and sample from that position&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(i in &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(align_matrix[1,])) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(align_matrix[1,][i] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;LETTERS&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    start_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; i
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    break
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;ref_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; align_matrix[1,start_seq&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(align_matrix[1,])]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; align_matrix[2,start_seq&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(align_matrix[1,])]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# find positions where the sample differs from the reference, ignoring gaps&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mutation_positions &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;which&lt;/span&gt;(ref_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!=&lt;/span&gt; sample_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; ref_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;-&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; sample_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;-&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# load into dataframe&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;data.frame&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  position &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; mutation_positions,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  reference &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; ref_seq[mutation_positions],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample_seq[mutation_positions],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mutation &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(ref_seq[mutation_positions], mutation_positions, sample_seq[mutation_positions])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(position_replace &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste0&lt;/span&gt;(position,sample))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(mutations)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;load_hivdb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(dataset, mutations){
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(dataset&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;NRTI&amp;#34;&lt;/span&gt;) {  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_nrti_single.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_nrti_combo.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(dataset&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;NNRTI&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_nnrti_single.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_nnrti_combo.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(dataset&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;PI&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_pi_single.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_pi_combo.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(dataset&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;INSTI&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_insti_single.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_insti_combo.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_interest &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(mut_i in mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(Rule)) { 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# TRUE when the sample carries this single-mutation rule&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    add_mut &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt;  mut_i &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; (mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(mutation))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(add_mut) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      mut_interest &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(mut_interest, mut_i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mut_interest)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;           &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Get 0 rows but keep structure&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;           &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;           &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summarise&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;across&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pivot_longer&lt;/span&gt;(cols &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), names_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ART&amp;#34;&lt;/span&gt;, values_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;score&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;9&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;14&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;15&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;29&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;30&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;59&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;60&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(interpretation &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;susceptible&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;potential low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;intermediate resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;high-level resistance&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            )))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_combo_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(combination_rule) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; + &amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_split&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; \\+ &amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unlist&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unique&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_combo_idx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_combo_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; (mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(mutation))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Function to sort mutations by position number&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sort_mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(mutation_string) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Extract the position number (first run of digits) from each mutation&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  positions &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_extract&lt;/span&gt;(mutation_string, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;\\d+&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Sort mutations by their positions&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  sorted_mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mutation_string&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;[order&lt;/span&gt;(positions)]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(sorted_mutations)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_combo_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sort_mutations&lt;/span&gt;(mut_combo_vec[mut_combo_idx])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Function to create combinations of a given size&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;create_combinations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(mutations_, size) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;combn&lt;/span&gt;(mutations_, size, FUN &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(x) &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(x, collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; + &amp;#34;&lt;/span&gt;), simplify &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mut_combo_seq)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;) { 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score_sum &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(Rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; mut_interest) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summarize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;across&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), sum)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pivot_longer&lt;/span&gt;(cols &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), names_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ART&amp;#34;&lt;/span&gt;, values_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;score&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ATV_r&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ATV/r&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;DRV_r&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;DRV/r&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;LPV_r&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;LPV/r&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; ART
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;9&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;14&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;15&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;29&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;30&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;59&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;60&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(interpretation &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;susceptible&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;potential low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;intermediate resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;high-level resistance&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  } else {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Generate combinations of every size from n to m (e.g. 2, 3, and 4)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;all_combinations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;map&lt;/span&gt;(n&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;m, &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;create_combinations&lt;/span&gt;(mut_combo_seq, .x)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unlist&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# filter from mut_score_combo&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_score_combo_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(combination_rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; all_combinations) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rename&lt;/span&gt;(Rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; combination_rule)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# sum single + combo&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_score_sum &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(Rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; mut_interest) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bind_rows&lt;/span&gt;(mut_score_combo_df) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summarize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;across&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), sum)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pivot_longer&lt;/span&gt;(cols &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), names_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ART&amp;#34;&lt;/span&gt;, values_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;score&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;9&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;14&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;15&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;29&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;30&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;59&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;60&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(interpretation &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;susceptible&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;potential low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;intermediate resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;high-level resistance&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(mut_score_sum)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;hiv_genotype &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(sample&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;KJ849778.1&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;class_group &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_RT&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_PR&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_INT&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(class in class_group) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;extract_align&lt;/span&gt;(sample, class)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(class &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_RT&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;load_hivdb&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;NRTI&amp;#34;&lt;/span&gt;,mutations))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;load_hivdb&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;NNRTI&amp;#34;&lt;/span&gt;,mutations))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(class &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_PR&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;load_hivdb&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;PI&amp;#34;&lt;/span&gt;,mutations))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(class &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_INT&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;print&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;load_hivdb&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;INSTI&amp;#34;&lt;/span&gt;,mutations))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  } 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;our_algo.png&#34; alt=&#34;image&#34; width=&#34;50%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;WHAT !?!?! My algorithm failed 💔 !!! ❌❌❌ Noooo&amp;hellip;!!! This is very odd. Let&amp;rsquo;s inspect!&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s pick RT and inspect.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;KJ849778.1&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;extract_align&lt;/span&gt;(class &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;HIV_RT&amp;#34;&lt;/span&gt;, sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(mutation)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;inspect1.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Now let&amp;rsquo;s look at Stanford&amp;rsquo;s result:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;inspect1_ref.png&#34; alt=&#34;&#34;&gt;
Do you see what I&amp;rsquo;m seeing? Our aligned AA has a lot of &lt;code&gt;X&lt;/code&gt;&amp;rsquo;s, and these &lt;code&gt;X&lt;/code&gt;&amp;rsquo;s coincide with Stanford&amp;rsquo;s. For example, our &lt;code&gt;M184X&lt;/code&gt; is Stanford&amp;rsquo;s &lt;code&gt;M184MI&lt;/code&gt;, and our &lt;code&gt;G190X&lt;/code&gt; is their &lt;code&gt;G190EKR&lt;/code&gt;. 🤔 Some of the other mutations match exactly, so it&amp;rsquo;s not a frame issue. Oh wait !!! An &lt;code&gt;X&lt;/code&gt; means we cannot tell exactly which amino acid is at that position, so we can&amp;rsquo;t tell whether there is a mutation at all, hence we have to assume it could be any of them !?! That totally makes sense! That means we&amp;rsquo;ll have to incorporate this into our algorithm! Also, not shown here: if a position is missing, the algorithm should also take the max mutation score as a penalty. Now let&amp;rsquo;s implement that!&lt;/p&gt;
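&lt;p&gt;Before the full implementation, here is a minimal sketch of the worst-case idea on a toy table. Everything in it is an assumption for illustration: the scores, the drug columns &lt;code&gt;drug_a&lt;/code&gt; and &lt;code&gt;drug_b&lt;/code&gt;, and the helper &lt;code&gt;worst_case_score()&lt;/code&gt; are made up and are not Stanford&amp;rsquo;s values or the final function. The rule it tries to show: when a mutation ends in &lt;code&gt;X&lt;/code&gt;, take the highest score per drug among all scored mutations at that position.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;# Toy score table: the drug columns and scores are made up for illustration
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;toy_scores &amp;lt;- data.frame(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  Rule   = c(&amp;#34;M184V&amp;#34;, &amp;#34;M184I&amp;#34;, &amp;#34;G190E&amp;#34;, &amp;#34;G190K&amp;#34;, &amp;#34;K103N&amp;#34;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  drug_a = c(60, 60, 0, 0, 0),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  drug_b = c(0, 0, 60, 30, 45)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;# Hypothetical helper: an ambiguous call such as M184X gets the per-drug maximum
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;# over every scored mutation at that position; other calls need an exact match.
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;worst_case_score &amp;lt;- function(mutation, scores) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  drug_cols &amp;lt;- setdiff(names(scores), &amp;#34;Rule&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  if (grepl(&amp;#34;X$&amp;#34;, mutation)) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    prefix &amp;lt;- sub(&amp;#34;X$&amp;#34;, &amp;#34;&amp;#34;, mutation)  # M184X becomes M184
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    keep &amp;lt;- sub(&amp;#34;[A-Z]$&amp;#34;, &amp;#34;&amp;#34;, scores$Rule) == prefix
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  } else {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    keep &amp;lt;- scores$Rule == mutation
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  hits &amp;lt;- scores[keep, drug_cols, drop = FALSE]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  # No matching rule: the mutation contributes a score of 0 for every drug
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  if (nrow(hits) == 0) return(setNames(rep(0, length(drug_cols)), drug_cols))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  vapply(hits, max, numeric(1))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;worst_case_score(&amp;#34;G190X&amp;#34;, toy_scores)  # drug_a = 0, drug_b = 60
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;On this toy table, &lt;code&gt;G190X&lt;/code&gt; picks up the larger of the &lt;code&gt;G190E&lt;/code&gt; and &lt;code&gt;G190K&lt;/code&gt; scores for each drug, which is the penalty described above. A missing position could presumably be handled the same way by treating it like an &lt;code&gt;X&lt;/code&gt;.&lt;/p&gt;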
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;load_hivdb &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(dataset, mutations){
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(dataset&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;NRTI&amp;#34;&lt;/span&gt;) {  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_nrti_single.csv&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_nrti_combo.csv&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(dataset&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;NNRTI&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_nnrti_single.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_nnrti_combo.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(dataset&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;PI&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_pi_single.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_pi_combo.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rename&lt;/span&gt;(`ATV/r`&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;ATV_r,`DRV/r`&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;DRV_r,`LPV/r`&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;LPV_r)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(dataset&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;INSTI&amp;#34;&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_insti_single.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;) 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  l74 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(Rule,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;L74&amp;#34;&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;Rule) 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  l74_2 &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;tibble&lt;/span&gt;(Rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;L74M&amp;#34;&lt;/span&gt;,&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;L74F&amp;#34;&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bind_cols&lt;/span&gt;(l74)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;!&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(Rule, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;L74&amp;#34;&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bind_rows&lt;/span&gt;(l74_2)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_csv&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hivdb_insti_combo.csv&amp;#34;&lt;/span&gt;,show_col_types &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_interest &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# If any mutation ends in X (ambiguous amino acid), strip the X so it matches every rule at that position&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sum&lt;/span&gt;(mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(mutation) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;X$&amp;#34;&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mutations_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_replace&lt;/span&gt;(mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(mutation), pattern &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;X$&amp;#34;&lt;/span&gt;, replacement &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;} else { 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mutations_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(mutation)}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;for &lt;/span&gt;(mut_i in mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(Rule)) { 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    add_mut &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;FALSE&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    add_mut &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(mut_i, &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(mutations_vec,collapse&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;|&amp;#34;&lt;/span&gt;))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(add_mut) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      mut_interest &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;c&lt;/span&gt;(mut_interest, mut_i)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mut_interest)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;           &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;slice_sample&lt;/span&gt;(n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Get 0 rows but keep structure&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;           &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;           &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summarise&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;across&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;0&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pivot_longer&lt;/span&gt;(cols &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), names_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ART&amp;#34;&lt;/span&gt;, values_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;score&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;                   &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ATV_r&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ATV/r&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;DRV_r&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;DRV/r&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;LPV_r&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;LPV/r&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; ART
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;9&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;14&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;15&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;29&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;30&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;59&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;60&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(interpretation &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;susceptible&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;potential low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;intermediate resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;              levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;high-level resistance&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            )))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_combo_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(combination_rule) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; + &amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_split&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; \\+ &amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unlist&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unique&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_combo_idx &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_combo_vec &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; (mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pull&lt;/span&gt;(mutation))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Function to sort mutations by position number&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sort_mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(mutation_string) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Extract the numeric position from each mutation (e.g. 184 from &amp;#34;M184V&amp;#34;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  positions &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_extract&lt;/span&gt;(mutation_string, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;\\d+&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;as.numeric&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Sort mutations by their positions&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  sorted_mutations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mutation_string&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;[order&lt;/span&gt;(positions)]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(sorted_mutations)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_combo_seq &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sort_mutations&lt;/span&gt;(mut_combo_vec[mut_combo_idx])
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Function to create combinations of a given size&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;create_combinations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(mutations_, size) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;combn&lt;/span&gt;(mutations_, size, FUN &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;function&lt;/span&gt;(x) &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;paste&lt;/span&gt;(x, collapse &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; + &amp;#34;&lt;/span&gt;), simplify &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;#sum(mutations |&amp;gt; pull(mutation) |&amp;gt; str_detect(&amp;#34;X$&amp;#34;)) &amp;gt; 0&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mut_combo_seq)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt;&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;) { 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  mut_score_sum &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(Rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; mut_interest) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(Rule_x &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_extract&lt;/span&gt;(Rule, &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;^[A-Z]{1}[0-9]+&amp;#34;&lt;/span&gt;)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# distinct(Rule_x, .keep_all = T) |&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;group_by&lt;/span&gt;(Rule_x) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summarize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;across&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(),max)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;Rule,&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;-&lt;/span&gt;Rule_x) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summarize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;across&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), sum)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pivot_longer&lt;/span&gt;(cols &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), names_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ART&amp;#34;&lt;/span&gt;, values_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;score&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ATV_r&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ATV/r&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;DRV_r&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;DRV/r&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;LPV_r&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;LPV/r&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; ART
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;9&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;14&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;15&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;29&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;30&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;59&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;60&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(interpretation &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;susceptible&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;potential low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;intermediate resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;high-level resistance&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  } else {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Generate combinations of every size from n to m (both capped at the number of mutations)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mut_combo_seq)) { m &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mut_combo_seq) }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;if &lt;/span&gt;(n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mut_combo_seq)) { n &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;length&lt;/span&gt;(mut_combo_seq) }  
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;all_combinations &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;map&lt;/span&gt;(n&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;:&lt;/span&gt;m, &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;create_combinations&lt;/span&gt;(mut_combo_seq, .x)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; 
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unlist&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# filter from mut_score_combo&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_score_combo_df &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score_combo &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(combination_rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; all_combinations) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;rename&lt;/span&gt;(Rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; combination_rule)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# sum single + combo&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;mut_score_sum &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; mut_score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;filter&lt;/span&gt;(Rule &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;%in%&lt;/span&gt; mut_interest) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;bind_rows&lt;/span&gt;(mut_score_combo_df) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;select&lt;/span&gt;(&lt;span style=&#34;color:#099&#34;&gt;-1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;summarize&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;across&lt;/span&gt;(&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), sum)) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;pivot_longer&lt;/span&gt;(cols &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;everything&lt;/span&gt;(), names_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ART&amp;#34;&lt;/span&gt;, values_to &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;score&amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ATV_r&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;ATV/r&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;DRV_r&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;DRV/r&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      ART &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;LPV_r&amp;#34;&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;LPV/r&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;TRUE&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; ART
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;9&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;10&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;14&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;15&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;29&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;30&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;amp;&lt;/span&gt; score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;59&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      score &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;gt;=&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;60&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    )) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;mutate&lt;/span&gt;(interpretation &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;case_when&lt;/span&gt;(
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;susceptible&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;2&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;potential low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;3&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;low-level resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;4&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;intermediate resistance&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      levels &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;==&lt;/span&gt; &lt;span style=&#34;color:#099&#34;&gt;5&lt;/span&gt; &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;~&lt;/span&gt; &lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;high-level resistance&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    ))
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;return&lt;/span&gt;(mut_score_sum)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;hiv_genotype&lt;/span&gt;()
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;try2.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;
&lt;p&gt;Not too shabby! Now our NNRTI and PI look exactly the same! But our INSTI is not great, mainly because there are quite a few deletions where our alignment and theirs don&amp;rsquo;t agree.&lt;/p&gt;
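&lt;p&gt;As an aside, the score-to-level cut-offs appear twice in &lt;code&gt;hiv_genotype()&lt;/code&gt;, once per branch. Below is a minimal sketch of pulling them into one helper. The names &lt;code&gt;score_to_level()&lt;/code&gt; and &lt;code&gt;level_labels&lt;/code&gt; are hypothetical (they are not part of the function above), and the sketch assumes scores are whole numbers, as the original cut-offs do.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;# Hypothetical helper: map a penalty score to levels 1-5
# using the same cut-offs as the case_when() blocks above.
score_to_level &amp;lt;- function(score) {
  # findInterval() returns 0 below 10, 1 for 10-14, 2 for 15-29,
  # 3 for 30-59 and 4 for 60 or more, so add 1 to get levels 1-5.
  findInterval(score, c(10, 15, 30, 60)) + 1
}

# Labels in level order, copied from the interpretation case_when()
level_labels &amp;lt;- c(
  &amp;#34;susceptible&amp;#34;,
  &amp;#34;potential low-level resistance&amp;#34;,
  &amp;#34;low-level resistance&amp;#34;,
  &amp;#34;intermediate resistance&amp;#34;,
  &amp;#34;high-level resistance&amp;#34;
)

# Illustrative scores only, chosen to hit each level once
scores &amp;lt;- c(0, 10, 15, 45, 60)
data.frame(
  score = scores,
  levels = score_to_level(scores),
  interpretation = level_labels[score_to_level(scores)]
)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Each branch could then call the helper inside a single &lt;code&gt;mutate()&lt;/code&gt;, so a change to the cut-offs only has to be made in one place.&lt;/p&gt;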
&lt;p&gt;Let&amp;rsquo;s try another genome and see if our algorithm agrees with Stanford&amp;rsquo;s.&lt;/p&gt;




&lt;h3 id=&#34;trial1&#34;&gt;Trial 1 ✅
  &lt;a href=&#34;#trial1&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;




&lt;h4 id=&#34;our-code&#34;&gt;Our code
  &lt;a href=&#34;#our-code&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sample&lt;/span&gt;(all_hiv_genome,&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_split&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unlist&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; _[1]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;hiv_genotype&lt;/span&gt;(sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;final1.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;




&lt;h4 id=&#34;stanfords&#34;&gt;Stanford&amp;rsquo;s
  &lt;a href=&#34;#stanfords&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;all_hiv_genome[all_hiv_genome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(sample)] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;writeXStringSet&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_test.fasta&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# run sierrapy input fasta&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;sierrapy fasta hiv_test.fasta -o hiv_test_output.json&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# read json output&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;jsonlite&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_json&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_test_output.0.json&amp;#34;&lt;/span&gt;, simplifyVector &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;T&lt;/span&gt;)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;drugResistance[[1]][3]&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;drugScores
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;img src=&#34;final1_sierra.png&#34; alt=&#34;&#34;&gt;
Alright! Not too shabby! That was accession &lt;code&gt;AY900571.2&lt;/code&gt;.&lt;/p&gt;
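&lt;p&gt;The comparison here is done by eye from the two screenshots. A rough sketch of doing the same check programmatically with &lt;code&gt;dplyr&lt;/code&gt; is below. It assumes both results have already been reshaped into data frames with one row per drug. The data frames &lt;code&gt;ours&lt;/code&gt; and &lt;code&gt;stanford&lt;/code&gt;, their column names, and the scores in them are made-up placeholders, not output from either tool.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;library(dplyr)

# Placeholder data: made-up scores, NOT real output from either tool
ours &amp;lt;- tibble(ART = c(&amp;#34;ABC&amp;#34;, &amp;#34;AZT&amp;#34;, &amp;#34;EFV&amp;#34;), score = c(15, 0, 60))
stanford &amp;lt;- tibble(ART = c(&amp;#34;ABC&amp;#34;, &amp;#34;AZT&amp;#34;, &amp;#34;EFV&amp;#34;), score = c(15, 0, 45))

# Join on drug name and keep any drug whose scores differ,
# or that is missing from one of the two tables
mismatches &amp;lt;- full_join(ours, stanford, by = &amp;#34;ART&amp;#34;, suffix = c(&amp;#34;_ours&amp;#34;, &amp;#34;_stanford&amp;#34;)) |&amp;gt;
  mutate(agree = score_ours == score_stanford) |&amp;gt;
  filter(!agree | is.na(agree))

mismatches
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;An empty &lt;code&gt;mismatches&lt;/code&gt; table would mean the two sets of scores agree for every drug.&lt;/p&gt;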




&lt;h3 id=&#34;trial2&#34;&gt;Trial 2 ✅
  &lt;a href=&#34;#trial2&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;Let&amp;rsquo;s try another one.&lt;/p&gt;




&lt;h4 id=&#34;our-code-1&#34;&gt;Our code
  &lt;a href=&#34;#our-code-1&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sample&lt;/span&gt;(all_hiv_genome,&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_split&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unlist&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; _[1]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;hiv_genotype&lt;/span&gt;(sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;final2.png&#34; alt=&#34;image&#34; width=&#34;50%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;




&lt;h4 id=&#34;stanfords-1&#34;&gt;Stanford&amp;rsquo;s
  &lt;a href=&#34;#stanfords-1&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;all_hiv_genome[all_hiv_genome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(sample)] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;writeXStringSet&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_test.fasta&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# run sierrapy input fasta&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;sierrapy fasta hiv_test.fasta -o hiv_test_output.json&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# read json output&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;jsonlite&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_json&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_test_output.0.json&amp;#34;&lt;/span&gt;, simplifyVector &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;TRUE&lt;/span&gt;)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;drugResistance[[1]][3]&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;drugScores
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p&gt;&lt;img src=&#34;final2_sierra.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;Awesome! No resistance at all! That was &lt;code&gt;FM877777.1&lt;/code&gt;. Let&amp;rsquo;s give another one a go!&lt;/p&gt;




&lt;h3 id=&#34;trial3&#34;&gt;Trial 3 ✅
  &lt;a href=&#34;#trial3&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;




&lt;h4 id=&#34;our-code-2&#34;&gt;Our code
  &lt;a href=&#34;#our-code-2&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sample&lt;/span&gt;(all_hiv_genome,&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_split&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unlist&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; _[1]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;hiv_genotype&lt;/span&gt;(sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;final3.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;




&lt;h4 id=&#34;stanfords-2&#34;&gt;Stanford&amp;rsquo;s
  &lt;a href=&#34;#stanfords-2&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;all_hiv_genome[all_hiv_genome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(sample)] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;writeXStringSet&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_test.fasta&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# run sierrapy input fasta&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;sierrapy fasta hiv_test.fasta -o hiv_test_output.json&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# read json output&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;jsonlite&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_json&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_test_output.0.json&amp;#34;&lt;/span&gt;, simplifyVector &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;TRUE&lt;/span&gt;)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;drugResistance[[1]][3]&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;drugScores
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p&gt;&lt;img src=&#34;final3_sierra.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;There you go! Our algorithm works! That was &lt;code&gt;EU448295.1&lt;/code&gt;. Let&amp;rsquo;s do another one!&lt;/p&gt;




&lt;h3 id=&#34;trial4&#34;&gt;Trial 4 ✅
  &lt;a href=&#34;#trial4&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;




&lt;h4 id=&#34;our-code-3&#34;&gt;Our code
  &lt;a href=&#34;#our-code-3&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sample&lt;/span&gt;(all_hiv_genome,&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_split&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unlist&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; _[1]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;hiv_genotype&lt;/span&gt;(sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;final4.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;




&lt;h4 id=&#34;stanfords-3&#34;&gt;Stanford&amp;rsquo;s
  &lt;a href=&#34;#stanfords-3&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;all_hiv_genome[all_hiv_genome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(sample)] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;writeXStringSet&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_test.fasta&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# run sierrapy input fasta&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;sierrapy fasta hiv_test.fasta -o hiv_test_output.json&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# read json output&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;jsonlite&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_json&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_test_output.0.json&amp;#34;&lt;/span&gt;, simplifyVector &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;TRUE&lt;/span&gt;)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;drugResistance[[1]][3]&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;drugScores
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p&gt;&lt;img src=&#34;final4_sierra.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;There you go! Our algorithm works again! That was &lt;code&gt;EU735539.1&lt;/code&gt;. Let&amp;rsquo;s do our last one.&lt;/p&gt;




&lt;h3 id=&#34;trial5&#34;&gt;Trial 5 ✅
  &lt;a href=&#34;#trial5&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;




&lt;h4 id=&#34;our-code-4&#34;&gt;Our code
  &lt;a href=&#34;#our-code-4&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;&amp;lt;-&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;sample&lt;/span&gt;(all_hiv_genome,&lt;span style=&#34;color:#099&#34;&gt;1&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_split&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34; &amp;#34;&lt;/span&gt;) &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;unlist&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; _[1]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;hiv_genotype&lt;/span&gt;(sample &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; sample)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p align=&#34;center&#34;&gt;
  &lt;img src=&#34;final5.png&#34; alt=&#34;image&#34; width=&#34;60%&#34; height=&#34;auto&#34;&gt;
&lt;/p&gt;




&lt;h4 id=&#34;stanfords-4&#34;&gt;Stanford&amp;rsquo;s
  &lt;a href=&#34;#stanfords-4&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;details&gt;
&lt;summary&gt;code&lt;/summary&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;all_hiv_genome[all_hiv_genome &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;names&lt;/span&gt;() &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;str_detect&lt;/span&gt;(sample)] &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;|&amp;gt;&lt;/span&gt; &lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;writeXStringSet&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_test.fasta&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# run sierrapy input fasta&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;system&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;sierrapy fasta hiv_test.fasta -o hiv_test_output.json&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# read json output&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;jsonlite&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;::&lt;/span&gt;&lt;span style=&#34;color:#900;font-weight:bold&#34;&gt;read_json&lt;/span&gt;(&lt;span style=&#34;color:#d14&#34;&gt;&amp;#34;hiv_test_output.0.json&amp;#34;&lt;/span&gt;, simplifyVector &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#999&#34;&gt;TRUE&lt;/span&gt;)&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;drugResistance[[1]][3]&lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;$&lt;/span&gt;drugScores
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/details&gt;
&lt;p&gt;&lt;img src=&#34;final5_sierra.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;
&lt;p&gt;That was &lt;code&gt;AF443091.1&lt;/code&gt;. And that makes 5 of 5 correct ✅! Hurray!!!!&lt;/p&gt;




&lt;h3 id=&#34;all&#34;&gt;Assess All
  &lt;a href=&#34;#all&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;We rewrote some of the functions (not shown here) and compared our algorithm with Sierra&amp;rsquo;s on all the downloaded HIV genome FASTAs with complete RT, PR, INT positions from BLAST (n=494). We found 96.2% full agreement (n=475), meaning zero deviation in the resistance level of any ART between the two algorithms on the same genome sample. Not too shabby at all! I suspect the remaining 3.8% is mainly down to alignment issues. I had initially used &lt;code&gt;AlignTranslation&lt;/code&gt; before sequence alignment and it only gave me 89% agreement. I found out that I had to create my own function to align it properly (align the DNA sequence first, then use the reference sequence to reposition and extract the sample sequence, then run &lt;code&gt;AlignTranslation&lt;/code&gt;). I also found out that the INSTI mutation score table is missing something.&lt;/p&gt;
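&lt;p&gt;As a rough sketch of how a full-agreement figure like this can be computed (not the code used for this post), the base R snippet below compares two small tables of per-drug resistance levels. The accession labels, drugs and level values are made up for illustration, and &lt;code&gt;ours&lt;/code&gt;, &lt;code&gt;sierra&lt;/code&gt; and &lt;code&gt;both&lt;/code&gt; are hypothetical names. A genome only counts as agreeing when every drug level matches.&lt;/p&gt;

```r
# Toy resistance levels per genome and drug; all values are hypothetical.
ours = data.frame(
  accession = c("S1", "S1", "S2", "S2", "S3", "S3"),
  drug      = c("3TC", "EFV", "3TC", "EFV", "3TC", "EFV"),
  level     = c(5, 1, 1, 1, 3, 1)
)
sierra = data.frame(
  accession = c("S1", "S1", "S2", "S2", "S3", "S3"),
  drug      = c("3TC", "EFV", "3TC", "EFV", "3TC", "EFV"),
  level     = c(5, 1, 1, 1, 4, 1)
)

# Pair up the two calls for every genome/drug combination.
both = merge(ours, sierra, by = c("accession", "drug"),
             suffixes = c("_ours", "_sierra"))

# A genome is in full agreement only if every drug level matches.
both$match = both$level_ours == both$level_sierra
per_genome = tapply(both$match, both$accession, all)

n_agree   = sum(per_genome)
pct_agree = round(100 * mean(per_genome), 1)
```

&lt;p&gt;With the toy data, 2 of the 3 genomes agree on every drug. The same arithmetic on the numbers above is 475/494, which rounds to 96.2%.&lt;/p&gt;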




&lt;h3 id=&#34;investigation&#34;&gt;Investigation
  &lt;a href=&#34;#investigation&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;
&lt;p&gt;&lt;img src=&#34;plot.png&#34; alt=&#34;&#34;&gt;&lt;/p&gt;




&lt;h3 id=&#34;nucamino&#34;&gt;Writing a Pipeline with Nucamino
  &lt;a href=&#34;#nucamino&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h3&gt;




&lt;h4 id=&#34;install-nucamino&#34;&gt;Install nucamino
  &lt;a href=&#34;#install-nucamino&#34;&gt;&lt;/a&gt;
&lt;/h4&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;background-color:#fff;-moz-tab-size:4;-o-tab-size:4;tab-size:4;&#34;&gt;&lt;code class=&#34;language-bash&#34; data-lang=&#34;bash&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Clone the repository&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;git clone https://github.com/hivdb/nucamino.git
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#0086b3&#34;&gt;cd&lt;/span&gt; nucamino
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Initialize Go modules (if needed)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;go mod init github.com/hivdb/nucamino 2&amp;gt;/dev/null &lt;span style=&#34;color:#000;font-weight:bold&#34;&gt;||&lt;/span&gt; &lt;span style=&#34;color:#0086b3&#34;&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;go mod tidy
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#998;font-style:italic&#34;&gt;# Build the binary&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;go build -o nucamino ./cmd/nucamino
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After using nucamino for alignment, we were able to get agreement with Sierra up to 99.8% (n=493 of 494)!!! Alright! I also found out that I had been missing &lt;code&gt;del&lt;/code&gt; as part of the NRTI mutation penalty. As for the last one, &lt;code&gt;DQ167215.1&lt;/code&gt;, I couldn&amp;rsquo;t really get it to work. I wonder if that has something to do with a frameshift; I haven&amp;rsquo;t figured out how nucamino handles those yet.&lt;/p&gt;




&lt;h2 id=&#34;thoughts&#34;&gt;Final Thoughts
  &lt;a href=&#34;#thoughts&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;p&gt;Hands down, using &lt;code&gt;SierraPy&lt;/code&gt; or a &lt;code&gt;Sierra&lt;/code&gt;-type app is the way to go if we want reproducibility. There are options for local apps out there (see below). But learning to reproduce this really helped me better understand how mutations are labeled and how susceptibility scoring is derived! It was a bumpy road, with lots of trials and lots more errors and failures, but ultimately it was a great learning experience. My respect for all the developers and researchers who have worked on this for years has really gone up! ❤️ It&amp;rsquo;s not easy at all! Even with the help of an LLM!&lt;/p&gt;




&lt;h2 id=&#34;opportunities&#34;&gt;Opportunities for Improvement
  &lt;a href=&#34;#opportunities&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Rewrite the interpretation code into a function so that we don&amp;rsquo;t have to paste it, or modify it, 3 different times in the &lt;code&gt;load_hivdb&lt;/code&gt; function&lt;/li&gt;
&lt;li&gt;Learn the exact algorithm Sierra uses for alignment, especially when it comes to deletions/insertions&lt;/li&gt;
&lt;li&gt;Look at &lt;code&gt;sierra-local&lt;/code&gt; to see how they use &lt;code&gt;nucamino&lt;/code&gt; to align and find mutations/indels&lt;/li&gt;
&lt;li&gt;Prettify the results with &lt;code&gt;gt&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Include more ARTs&lt;/li&gt;
&lt;li&gt;Look into Stanford HIVdb NGS analysis&lt;/li&gt;
&lt;/ul&gt;
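&lt;p&gt;For the first item, one possible shape is a single interpretation function that is reused for every drug class. This is only a sketch under assumptions: &lt;code&gt;score_to_level&lt;/code&gt;, &lt;code&gt;interpret&lt;/code&gt; and the &lt;code&gt;mutation&lt;/code&gt;/&lt;code&gt;drug&lt;/code&gt;/&lt;code&gt;score&lt;/code&gt; table layout are hypothetical names, the penalties are toy values rather than the real HIVdb table, and the cutoffs (10, 15, 30, 60) are the HIVdb level boundaries as I understand them.&lt;/p&gt;

```r
# Map a total penalty score to an HIVdb-style resistance level.
# 10, 15, 30 and 60 are taken as the lower bounds of levels 2 to 5.
score_to_level = function(score) {
  labels = c("Susceptible", "Potential Low-Level Resistance",
             "Low-Level Resistance", "Intermediate Resistance",
             "High-Level Resistance")
  labels[findInterval(score, c(10, 15, 30, 60)) + 1]
}

# One interpretation function, reusable for any drug class.
# `score_table` needs the columns mutation, drug and score (assumed layout).
interpret = function(mutations, score_table) {
  hits   = score_table[score_table$mutation %in% mutations, ]
  totals = tapply(hits$score, hits$drug, sum)
  data.frame(drug  = names(totals),
             score = as.numeric(totals),
             level = score_to_level(as.numeric(totals)),
             row.names = NULL)
}

# Toy penalties for illustration only, not the real HIVdb table.
nrti_scores = data.frame(
  mutation = c("M184V", "M184V", "K65R", "K65R"),
  drug     = c("3TC", "TDF", "3TC", "TDF"),
  score    = c(60, -10, 30, 60)
)

result = interpret(c("M184V", "K65R"), nrti_scores)
```

&lt;p&gt;The same &lt;code&gt;interpret()&lt;/code&gt; call would then work for NNRTI, PI or INSTI by passing a different score table, instead of pasting the logic once per class.&lt;/p&gt;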




&lt;h2 id=&#34;lessons&#34;&gt;Lessons Learnt
  &lt;a href=&#34;#lessons&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;
      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;
      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;
    &lt;/svg&gt;&lt;/a&gt;
&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;We tried the genotype-phenotype 
&lt;a href=&#34;https://hivdb.stanford.edu/download/GenoPhenoDatasets/DRMcv.R&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;DRMcv&lt;/a&gt; script, but the results were not reproducible and did not match the mutation interpretation results, probably due to user error.&lt;/li&gt;
&lt;li&gt;After multiple attempts to align the sequence, map it to the reference, and then translate, I found out that &lt;code&gt;DECIPHER::AlignTranslation()&lt;/code&gt; does a better job at this! Don&amp;rsquo;t use &lt;code&gt;AlignSeq&lt;/code&gt; for aligning amino acids!&lt;/li&gt;
&lt;li&gt;Learnt that the positions in mutations such as M184V and K65R are gene-specific (RT, PR, or IN)!&lt;/li&gt;
&lt;li&gt;Learnt that Stanford HIVdb treats an &lt;code&gt;X&lt;/code&gt; as any of the possible variants at that position. For example, &lt;code&gt;M184X&lt;/code&gt; could be &lt;code&gt;M184I&lt;/code&gt; or &lt;code&gt;M184V&lt;/code&gt;, so it assumes the worst-case scenario and assigns the max mutation score for that position.&lt;/li&gt;
&lt;/ul&gt;
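&lt;p&gt;The worst-case rule in the last item can be sketched in a few lines of base R. This is my own illustration rather than Sierra&amp;rsquo;s implementation: &lt;code&gt;worst_case_score&lt;/code&gt; is a hypothetical helper, and the penalties are toy values chosen so that the two variants differ.&lt;/p&gt;

```r
# Toy penalties for illustration only, not the real HIVdb table.
scores = data.frame(
  mutation = c("M184V", "M184I", "M184V", "M184I"),
  drug     = c("3TC", "3TC", "ABC", "ABC"),
  score    = c(60, 50, 15, 10)
)

# For an ambiguous call such as "M184X", collect every scored variant at
# that position and keep the highest penalty per drug (worst case).
# An unambiguous call such as "M184V" is matched exactly.
worst_case_score = function(call, scores) {
  if (endsWith(call, "X")) {
    pattern = paste0("^", sub("X$", "", call), "[A-Z]$")
    hits = scores[grepl(pattern, scores$mutation), ]
  } else {
    hits = scores[scores$mutation == call, ]
  }
  tapply(hits$score, hits$drug, max)
}

ambiguous = worst_case_score("M184X", scores)
exact     = worst_case_score("M184I", scores)
```

&lt;p&gt;Here &lt;code&gt;ambiguous&lt;/code&gt; picks up the larger of the two toy penalties for each drug, while &lt;code&gt;exact&lt;/code&gt; only uses the &lt;code&gt;M184I&lt;/code&gt; rows.&lt;/p&gt;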
&lt;p&gt;If you like this article:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;please feel free to send me a 
&lt;a href=&#34;https://www.kenkoonwong.com/blog/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;comment or visit my other blogs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;please feel free to follow me on 
&lt;a href=&#34;https://bsky.app/profile/kenkoonwong.bsky.social&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;BlueSky&lt;/a&gt;, 
&lt;a href=&#34;https://twitter.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;twitter&lt;/a&gt;, 
&lt;a href=&#34;https://github.com/kenkoonwong/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub&lt;/a&gt; or 
&lt;a href=&#34;https://med-mastodon.com/@kenkoonwong&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Mastodon&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;if you would like to collaborate please feel free to 
&lt;a href=&#34;https://www.kenkoonwong.com/contact/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;contact me&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
  </channel>
</rss>