{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Kotlin/JVM package dependencies + imports\n",
    "\n",
    "We pull in lets-plot with `%use`, which automatically sets up rich output.\n",
    "\n",
    "Fuel is 'officially supported' by Kotlin-Jupyter, but loading it that way was causing problems, so we declare it manually with `@file:DependsOn`. jsoup and Moshi aren't supported by Kotlin-Jupyter, so they need manual declarations as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "@file:Repository(\"https://repo1.maven.org/maven2/\")\n",
    "@file:DependsOn(\"com.github.kittinunf.fuel:fuel:2.2.3\")\n",
    "@file:DependsOn(\"com.github.kittinunf.fuel:fuel-coroutines:2.2.3\")\n",
    "@file:DependsOn(\"org.jsoup:jsoup:1.13.1\")\n",
    "@file:DependsOn(\"com.squareup.moshi:moshi-kotlin:1.9.3\")\n",
    "@file:DependsOn(\"de.mpicbg.scicomp:krangl:0.13\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import java.io.File\n",
    "import kotlinx.coroutines.*\n",
    "import com.github.kittinunf.result.Result\n",
    "import com.github.kittinunf.fuel.Fuel\n",
    "import com.github.kittinunf.fuel.core.FuelManager\n",
    "import com.github.kittinunf.fuel.coroutines.*\n",
    "import org.jsoup.Jsoup\n",
    "import org.jsoup.nodes.Document\n",
    "import com.squareup.moshi.*\n",
    "import com.squareup.moshi.kotlin.reflect.KotlinJsonAdapterFactory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "   <div id=\"JpM1PK\"></div>\n",
       "   <script type=\"text/javascript\" data-lets-plot-script=\"library\">\n",
       "       if(!window.letsPlotCallQueue) {\n",
       "           window.letsPlotCallQueue = [];\n",
       "       }; \n",
       "       window.letsPlotCall = function(f) {\n",
       "           window.letsPlotCallQueue.push(f);\n",
       "       };\n",
       "       (function() {\n",
       "           var script = document.createElement(\"script\");\n",
       "           script.type = \"text/javascript\";\n",
       "           script.src = \"https://dl.bintray.com/jetbrains/lets-plot/lets-plot-1.5.2.min.js\";\n",
       "           script.onload = function() {\n",
       "               window.letsPlotCall = function(f) {f();};\n",
       "               window.letsPlotCallQueue.forEach(function(f) {f();});\n",
       "               window.letsPlotCallQueue = [];\n",
       "               \n",
       "               \n",
       "           };\n",
       "           script.onerror = function(event) {\n",
       "               window.letsPlotCall = function(f) {};\n",
       "               window.letsPlotCallQueue = [];\n",
       "               var div = document.createElement(\"div\");\n",
       "               div.style.color = 'darkred';\n",
       "               div.textContent = 'Error loading Lets-Plot JS';\n",
       "               document.getElementById(\"JpM1PK\").appendChild(div);\n",
       "           };\n",
       "           var e = document.getElementById(\"JpM1PK\");\n",
       "           e.appendChild(script);\n",
       "       })();\n",
       "   </script>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%use lets-plot"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### define our classes\n",
    "Normally we could annotate each class with `@JsonClass(generateAdapter = true)` to have Moshi generate its JSON adapter.\n",
    "I don't believe that's possible in Kotlin-Jupyter (happy to be wrong about this), so we create the adapters by hand."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "// @JsonClass(generateAdapter = true)\n",
    "data class ScoringPlay(\n",
    "    val quarter : Int,\n",
    "    val timeString : String,\n",
    "    val secondsElapsed : Int,\n",
    "    val team : String,\n",
    "    val detail : String,\n",
    "    val awayscore : Int,\n",
    "    val homescore : Int\n",
    ")\n",
    "\n",
    "// @JsonClass(generateAdapter = true)\n",
    "data class PFRWeek(val season : Int, val weeknumber : Int, val pfrURLs : List<String>)\n",
    "\n",
    "// @JsonClass(generateAdapter = true)\n",
    "data class PFRGame(\n",
    "    val season : Int,\n",
    "    val week : Int,\n",
    "    val pfrURL : String,\n",
    "    val hometeam : String, \n",
    "    val awayteam : String, \n",
    "    val homescore : Int,\n",
    "    val awayscore : Int,\n",
    "    val scoringplays : List<ScoringPlay>\n",
    ")\n",
    "\n",
    "// @JsonClass(generateAdapter = true)\n",
    "data class TeamRecord(\n",
    "    val season : Int,\n",
    "    val teamname : String,\n",
    "    val url : String,\n",
    "    val abbr : String, \n",
    "    val wins : Int,\n",
    "    val losses : Int,\n",
    "    val ties : Int,\n",
    "    val pointsFor : Int,\n",
    "    val pointsAgainst : Int,\n",
    "    val pfrOSRS : Float,\n",
    "    val pfrDSRS : Float\n",
    ")\n",
    "\n",
    "data class PFRData(val games : List<PFRGame>, val records : List<TeamRecord>)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "// in Kotlin-Jupyter, I don't think Moshi's codegen (the @JsonClass annotations) is available,\n",
    "// so we build Moshi with the reflection-based KotlinJsonAdapterFactory and request each adapter explicitly\n",
    "val moshi : Moshi = Moshi.Builder().add(KotlinJsonAdapterFactory()).build()\n",
    "val adapterScoringPlay : JsonAdapter<ScoringPlay> = moshi.adapter(ScoringPlay::class.java)\n",
    "val adapterPFRGame : JsonAdapter<PFRGame> = moshi.adapter(PFRGame::class.java)\n",
    "val adapterPFRWeek : JsonAdapter<PFRWeek> = moshi.adapter(PFRWeek::class.java)\n",
    "val adapterTeamRecords : JsonAdapter<TeamRecord> = moshi.adapter(TeamRecord::class.java)\n",
    "val adapterPFRData : JsonAdapter<PFRData> = moshi.adapter(PFRData::class.java)"
   ]
  },
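  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of what one of these adapters does, assuming the same Moshi 1.9.3 dependency as above. `WeekStub`, `sketchMoshi` and `sketchAdapter` are hypothetical names used only here, so nothing defined in the notebook is overwritten.\n",
    "\n",
    "```kotlin\n",
    "@file:DependsOn(\"com.squareup.moshi:moshi-kotlin:1.9.3\")\n",
    "\n",
    "import com.squareup.moshi.JsonAdapter\n",
    "import com.squareup.moshi.Moshi\n",
    "import com.squareup.moshi.kotlin.reflect.KotlinJsonAdapterFactory\n",
    "\n",
    "// Hypothetical stand-in with the same shape as PFRWeek, kept separate for illustration\n",
    "data class WeekStub(val season : Int, val weeknumber : Int, val pfrURLs : List<String>)\n",
    "\n",
    "// The reflection-based factory lets Moshi read Kotlin constructor parameters\n",
    "val sketchMoshi : Moshi = Moshi.Builder().add(KotlinJsonAdapterFactory()).build()\n",
    "val sketchAdapter : JsonAdapter<WeekStub> = sketchMoshi.adapter(WeekStub::class.java)\n",
    "\n",
    "val original = WeekStub(season = 2017, weeknumber = 17, pfrURLs = listOf(\"/boxscores/201712310phi.htm\"))\n",
    "val json : String = sketchAdapter.toJson(original)       // serialize to a JSON string\n",
    "val restored : WeekStub? = sketchAdapter.fromJson(json)  // fromJson returns a nullable value\n",
    "\n",
    "println(json)\n",
    "println(restored == original)  // data classes compare by value\n",
    "```\n",
    "\n",
    "`fromJson` has a nullable return type, so any caller has to decide how to handle a possible `null`."
   ]
  },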
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### define scraping functions\n",
    "\n",
    "`PFRWeek` is just a conduit for collecting the game URLs -- we won't save it\n",
    "\n",
    "from the weeks we fetch the games, which we persist\n",
    "\n",
    "we also fetch team records, which we persist as well"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "// scraping NFL weeks from PFR - really we are only interested in the URL to each boxscore\n",
    "fun getWeeks(seasonRange : IntRange, weekRange : IntRange = IntRange(1,3)) : List<PFRWeek> {\n",
    "    \n",
    "    return seasonRange.fold(mutableListOf<PFRWeek>(), { accumulator , year ->\n",
    "    \n",
    "        println(\"season: ${year}\")\n",
    "        weekRange.forEach { w ->\n",
    "\n",
    "            println(\"- week: ${w}\")\n",
    "            val (_, _, result) = Fuel.get(\"https://www.pro-football-reference.com/years/${year}/week_${w}.htm\")\n",
    "                .responseString()\n",
    "\n",
    "            when (result) {\n",
    "                // we don't want to try to continue if there's been an error\n",
    "                is Result.Failure -> throw result.getException()  \n",
    "                is Result.Success -> {\n",
    "                    val pfrPage = result.get()\n",
    "                    val doc : Document = Jsoup.parse(pfrPage)\n",
    "                    val hrefs : List<String> = \n",
    "                        doc.select(\".game_summaries .game_summary .gamelink a\")\n",
    "                            .map {element -> element.attr(\"href\")}\n",
    "                    accumulator.add(PFRWeek(season = year, weeknumber = w, pfrURLs = hrefs))\n",
    "                }\n",
    "            }\n",
    "        }\n",
    "        accumulator\n",
    "    })\n",
    "}"
   ]
  },
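  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what the selector in `getWeeks` picks up without hitting the network, here is a small sketch run against an inline HTML fragment. The fragment is hypothetical: it only mimics the class names the selector expects and is not copied from Pro-Football-Reference. `sampleHtml`, `sampleDoc` and `sampleHrefs` are illustrative names.\n",
    "\n",
    "```kotlin\n",
    "@file:DependsOn(\"org.jsoup:jsoup:1.13.1\")\n",
    "\n",
    "import org.jsoup.Jsoup\n",
    "import org.jsoup.nodes.Document\n",
    "\n",
    "// Hypothetical markup: two game summaries, each with one box-score link\n",
    "val sampleHtml = \"\"\"\n",
    "    <div class=\"game_summaries\">\n",
    "      <div class=\"game_summary\">\n",
    "        <div class=\"gamelink\"><a href=\"/boxscores/201712310phi.htm\">Final</a></div>\n",
    "      </div>\n",
    "      <div class=\"game_summary\">\n",
    "        <div class=\"gamelink\"><a href=\"/boxscores/201910200was.htm\">Final</a></div>\n",
    "      </div>\n",
    "    </div>\n",
    "\"\"\".trimIndent()\n",
    "\n",
    "val sampleDoc : Document = Jsoup.parse(sampleHtml)\n",
    "val sampleHrefs : List<String> = sampleDoc\n",
    "    .select(\".game_summaries .game_summary .gamelink a\")  // same selector as getWeeks\n",
    "    .map { element -> element.attr(\"href\") }               // keep only the relative URL\n",
    "\n",
    "println(sampleHrefs)  // [/boxscores/201712310phi.htm, /boxscores/201910200was.htm]\n",
    "```\n",
    "\n",
    "The hrefs are relative, which is why `getGames` prefixes them with the site's base URL."
   ]
  },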
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "// scraping from PFR - we are getting the teams, final score, and scoring plays to be able to calculate all in-game point margins\n",
    "\n",
    "// this is an ASYNC function (using Kotlin coroutines)\n",
    "// note the `suspend fun`, `.awaitStringResponseResult()` and `coroutineScope`, `async` and `awaitAll`\n",
    "\n",
    "// this is the only function that takes long enough to be worth async-ing\n",
    "\n",
    "suspend fun getGames(weeks : List<PFRWeek>) : List<PFRGame> {\n",
    "    \n",
    "    val games = mutableListOf<PFRGame>()\n",
    "    \n",
    "    coroutineScope {\n",
    "        \n",
    "        weeks.forEach { week -> \n",
    "\n",
    "            week.pfrURLs.map { url ->\n",
    "                async {\n",
    "                    println(\"Game: season = ${week.season}, week = ${week.weeknumber}, url = ${url}\")\n",
    "                    val (_, _, result) = Fuel.get(\"https://www.pro-football-reference.com${url}\")\n",
    "                        .awaitStringResponseResult()\n",
    "                    \n",
    "                    when (result) {\n",
    "                        // we don't want to try to continue if there's been an error\n",
    "                        is Result.Failure -> throw result.getException()  \n",
    "                        is Result.Success -> {\n",
    "                            val pfrPage = result.get()\n",
    "                            val doc : Document = Jsoup.parse(pfrPage)\n",
    "                            val scoreboxes = doc.select(\".scorebox > div\")\n",
    "                            val scorerows = doc.select(\"table#scoring tbody tr\")\n",
    "                            var currentQuarter = 1 // PFR only \"announces\" the quarter once (not on every row) so we need a stateholder\n",
    "                            val scores = scorerows.map { r -> \n",
    "                                currentQuarter = r.select(\"th[data-stat='quarter']\").text().let {\n",
    "                                    when(it.trim()) {\n",
    "                                        \"OT\" -> 5\n",
    "                                        \"OT2\" -> 6\n",
    "                                        \"\" -> currentQuarter // when there's no value, we use the latest value\n",
    "                                        else -> it.toInt() // when a numerical value is present, (obviously) that's the new value\n",
    "                                    }\n",
    "                                }\n",
    "                                val secondsElapsed : Int = r.select(\"td[data-stat='time']\").text().split(\":\").let {\n",
    "                                    (currentQuarter - 1) * 900 + \n",
    "                                        (14 - it[0].toInt()) * 60 + \n",
    "                                            (60 - it[1].toInt())\n",
    "                                }\n",
    "                                ScoringPlay(\n",
    "                                    quarter = currentQuarter,\n",
    "                                    timeString = r.select(\"td[data-stat='time']\").text(),\n",
    "                                    secondsElapsed = secondsElapsed,\n",
    "                                    team = r.select(\"td[data-stat='team']\").text(),\n",
    "                                    detail = r.select(\"td[data-stat='description']\").text(),\n",
    "                                    awayscore = r.select(\"td[data-stat='vis_team_score']\").text().toInt(),\n",
    "                                    homescore = r.select(\"td[data-stat='home_team_score']\").text().toInt()\n",
    "                                ) \n",
    "                            }\n",
    "                            games.add(PFRGame(\n",
    "                                        season = week.season,\n",
    "                                        week = week.weeknumber,\n",
    "                                        pfrURL = url,\n",
    "                                        hometeam = scoreboxes[0].select(\"strong a\").text(),\n",
    "                                        awayteam = scoreboxes[1].select(\"strong a\").text(),\n",
    "                                        homescore = scoreboxes[0].select(\".scores .score\").text().toInt(),\n",
    "                                        awayscore = scoreboxes[1].select(\".scores .score\").text().toInt(),\n",
    "                                        scoringplays = scores\n",
    "                                    )\n",
    "                            )\n",
    "                            println(\"new game added!\")\n",
    "                        }\n",
    "                    }\n",
    "                }\n",
    "            }.awaitAll()\n",
    "        }\n",
    "    \n",
    "    }\n",
    "    return games\n",
    "}"
   ]
  },
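  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`getGames` appends to a shared list from inside each `async` block. A common alternative, sketched here, is to have each `async` return its value and collect everything with `awaitAll`, which hands results back in input order. This is only an illustration of the pattern: `fetchLength` and `fetchAll` are hypothetical, `delay` stands in for the HTTP call, and kotlinx.coroutines is assumed to be on the classpath (as it is for the imports above).\n",
    "\n",
    "```kotlin\n",
    "import kotlinx.coroutines.*\n",
    "\n",
    "// Hypothetical stand-in for a network request: waits briefly, then returns a value\n",
    "suspend fun fetchLength(url : String) : Int {\n",
    "    delay(10L * url.length)  // longer URLs finish later, so completion order differs from input order\n",
    "    return url.length\n",
    "}\n",
    "\n",
    "suspend fun fetchAll(urls : List<String>) : List<Int> = coroutineScope {\n",
    "    urls.map { url -> async { fetchLength(url) } }  // start every request concurrently\n",
    "        .awaitAll()                                  // suspend until all finish; results keep input order\n",
    "}\n",
    "\n",
    "val lengths = runBlocking { fetchAll(listOf(\"/ccc\", \"/a\", \"/bb\")) }\n",
    "println(lengths)  // [4, 2, 3]\n",
    "```\n",
    "\n",
    "Returning values this way avoids shared mutable state and keeps the games in the order of the input URLs."
   ]
  },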
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "// scraping final season records from PFR - we want to know the record for the teams with large deficits\n",
    "\n",
    "fun getTeamRecords(seasonRange : IntRange) : List<TeamRecord> {\n",
    "    \n",
    "    val teamRecords = mutableListOf<TeamRecord>()\n",
    "    \n",
    "    seasonRange.forEach { year ->\n",
    "    \n",
    "        println(\"season: ${year}\")\n",
    "        val (_, _, result) = Fuel.get(\"https://www.pro-football-reference.com/years/${year}/\").responseString()\n",
    "\n",
    "        when (result) {\n",
    "            // we don't want to try to continue if there's been an error\n",
    "            is Result.Failure -> throw result.getException()  \n",
    "            is Result.Success -> {\n",
    "                val pfrPage = result.get()\n",
    "                val doc : Document = Jsoup.parse(pfrPage)\n",
    "                val recordRows = doc.select(\".content_grid tbody tr:not([class*=thead])\")\n",
    "                recordRows.forEach { r -> \n",
    "                    println(r.select(\"th a\").text())\n",
    "                    teamRecords.add(TeamRecord(\n",
    "                        season = year,\n",
    "                        teamname = r.select(\"th a\").text(),\n",
    "                        abbr = r.select(\"th a\").attr(\"href\").substringBeforeLast(\"/\").substringAfterLast(\"/\"), \n",
    "                        url = r.select(\"th a\").attr(\"href\"),\n",
    "                        wins = r.select(\"td[data-stat='wins']\").text().toInt(),\n",
    "                        losses = r.select(\"td[data-stat='losses']\").text().toInt(),\n",
    "                        ties = r.select(\"td[data-stat='ties']\").text().let { if (it.isBlank()) 0 else it.toInt() },\n",
    "                        pointsFor = r.select(\"td[data-stat='points']\").text().toInt(),\n",
    "                        pointsAgainst = r.select(\"td[data-stat='points_opp']\").text().toInt(),\n",
    "                        pfrOSRS = r.select(\"td[data-stat='srs_offense']\").text().toFloat(),\n",
    "                        pfrDSRS = r.select(\"td[data-stat='srs_defense']\").text().toFloat(),\n",
    "                    )) \n",
    "                }\n",
    "            }\n",
    "        }\n",
    "    }\n",
    "    \n",
    "    return teamRecords\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### pull in data... load JSON file if it exists, or perform scrape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "...loading previously-scraped data...\r\n"
     ]
    }
   ],
   "source": [
    "val dataFile : File = File(\"e:/pfrdata_async.json\")\n",
    "if (!dataFile.exists()) {\n",
    "    println(\"...scraping data from Pro-Football-Reference...\")\n",
    "    val pfrWeeks : List<PFRWeek> = getWeeks(seasonRange = IntRange(2015,2019), weekRange = IntRange(1,21))    \n",
    "    runBlocking {\n",
    "        val pfrGames : List<PFRGame> = getGames(pfrWeeks) // this is the only async function\n",
    "        val teamRecords : List<TeamRecord> = getTeamRecords(seasonRange = IntRange(2015,2019))\n",
    "        val pfrData = PFRData(games = pfrGames , records = teamRecords)\n",
    "        dataFile.writeText( adapterPFRData.toJson(pfrData) )        \n",
    "    }\n",
    "} else {\n",
    "    println(\"...loading previously-scraped data...\")\n",
    "}\n",
    "\n",
    "val (rawGames, teamRecords) = adapterPFRData.fromJson(dataFile.readText())!!\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### some basic understanding of the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1335"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rawGames.size  // count of all games in data set, including playoffs\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "160"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "teamRecords.size  // 5 seasons * 32 teams = 160 season records"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[ScoringPlay(quarter=1, timeString=9:28, secondsElapsed=332, team=Bears, detail=Robbie Gould 28 yard field goal, awayscore=0, homescore=3), ScoringPlay(quarter=1, timeString=0:43, secondsElapsed=857, team=Packers, detail=James Jones 13 yard pass from Aaron Rodgers (Mason Crosby kick), awayscore=7, homescore=3), ScoringPlay(quarter=2, timeString=7:49, secondsElapsed=1331, team=Bears, detail=Matt Forte 1 yard rush (Robbie Gould kick), awayscore=7, homescore=10), ScoringPlay(quarter=2, timeString=2:32, secondsElapsed=1648, team=Packers, detail=Mason Crosby 37 yard field goal, awayscore=10, homescore=10), ScoringPlay(quarter=2, timeString=0:08, secondsElapsed=1792, team=Bears, detail=Robbie Gould 50 yard field goal, awayscore=10, homescore=13), ScoringPlay(quarter=3, timeString=11:56, secondsElapsed=1984, team=Packers, detail=James Jones 1 yard pass from Aaron Rodgers (Mason Crosby kick), awayscore=17, homescore=13), ScoringPlay(quarter=3, timeString=4:57, secondsElapsed=2403, team=Bears, detail=Robbie Gould 44 yard field goal, awayscore=17, homescore=16), ScoringPlay(quarter=4, timeString=10:26, secondsElapsed=2974, team=Packers, detail=Randall Cobb 5 yard pass from Aaron Rodgers (Mason Crosby kick), awayscore=24, homescore=16), ScoringPlay(quarter=4, timeString=1:55, secondsElapsed=3485, team=Packers, detail=Eddie Lacy 2 yard rush (Mason Crosby kick), awayscore=31, homescore=16), ScoringPlay(quarter=4, timeString=0:34, secondsElapsed=3566, team=Bears, detail=Martellus Bennett 24 yard pass from Jay Cutler (Robbie Gould kick), awayscore=31, homescore=23)]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rawGames[2].scoringplays  // example of a list of scoring plays"
   ]
  },
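  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `secondsElapsed` values in the output above come from the game clock, which counts down from 15:00 in each quarter. A short sketch of the same arithmetic as a standalone function (`elapsedSeconds` is a hypothetical helper name, not part of the notebook), checked against four of the plays listed above:\n",
    "\n",
    "```kotlin\n",
    "// Hypothetical helper; the arithmetic mirrors the secondsElapsed expression in getGames\n",
    "fun elapsedSeconds(quarter : Int, clock : String) : Int {\n",
    "    val (minutes, seconds) = clock.split(\":\").map { it.toInt() }\n",
    "    val remainingInQuarter = minutes * 60 + seconds\n",
    "    // 900 seconds per 15-minute quarter: full quarters already played, plus time used in this one\n",
    "    return (quarter - 1) * 900 + (900 - remainingInQuarter)\n",
    "}\n",
    "\n",
    "println(elapsedSeconds(1, \"9:28\"))  // 332\n",
    "println(elapsedSeconds(1, \"0:43\"))  // 857\n",
    "println(elapsedSeconds(2, \"7:49\"))  // 1331\n",
    "println(elapsedSeconds(4, \"0:34\"))  // 3566\n",
    "```\n",
    "\n",
    "Halftime falls at 2 * 900 = 1800 seconds, which is the cutoff used for first-half filters."
   ]
  },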
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[PFRGame(season=2017, week=17, pfrURL=/boxscores/201712310phi.htm, hometeam=Philadelphia Eagles, awayteam=Dallas Cowboys, homescore=0, awayscore=6, scoringplays=[ScoringPlay(quarter=4, timeString=12:19, secondsElapsed=2861, team=Cowboys, detail=Brice Butler 20 yard pass from Dak Prescott (Dan Bailey kick failed), awayscore=6, homescore=0)]), PFRGame(season=2019, week=7, pfrURL=/boxscores/201910200was.htm, hometeam=Washington Redskins, awayteam=San Francisco 49ers, homescore=0, awayscore=9, scoringplays=[ScoringPlay(quarter=3, timeString=5:32, secondsElapsed=2368, team=49ers, detail=Robbie Gould 28 yard field goal, awayscore=3, homescore=0), ScoringPlay(quarter=4, timeString=9:06, secondsElapsed=3054, team=49ers, detail=Robbie Gould 22 yard field goal, awayscore=6, homescore=0), ScoringPlay(quarter=4, timeString=0:27, secondsElapsed=3573, team=49ers, detail=Robbie Gould 29 yard field goal, awayscore=9, homescore=0)])]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// games with no scores in 1st half... these need null handling for first-half margins\n",
    "rawGames.filter { g -> g.scoringplays.none { sp -> sp.secondsElapsed <= 1800 } }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### define a class to be used for analysis. we create functions and calculated properties based on the raw data\n",
    "\n",
    "Our data is not entirely tabular: each game has multiple scoring plays, and therefore multiple `Margin` objects. The class also works out winners and losers and maps the home/away sides to team names, so it is a bit messy but necessary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "class GameAnalysis(val source : PFRGame, val teamRecords : List<TeamRecord>) {\n",
    "    \n",
    "    val isPlayoff : Boolean = source.week >= 18\n",
    "    \n",
    "    inner class Margin(\n",
    "        val points : Int,\n",
    "        val timeElapsed : Int,\n",
    "        val leadingSide : String, // home or away (not team name)\n",
    "        val lostLead : Boolean? = null, // (not implemented yet) did the leading team ever lose the lead, even if they eventually won\n",
    "        val wonGame : Boolean,\n",
    "        \n",
    "        val leadingTeam : String = teamName(leadingSide),\n",
    "        val trailingTeam : String = teamName(opponent(leadingSide))\n",
    "    )\n",
    "    \n",
    "    fun leader(away : Int, home : Int) : String = when {\n",
    "        away > home -> \"away\" \n",
    "        away < home -> \"home\" \n",
    "        away == home -> \"tie\" \n",
    "        else -> \"uh-oh\"\n",
    "    }\n",
    "    \n",
    "    fun opponent(side : String) : String = when(side) {\n",
    "        \"away\" -> \"home\"\n",
    "        \"home\" -> \"away\"\n",
    "        \"tie\" -> \"tie\"\n",
    "        else -> \"uh-oh\"\n",
    "    }\n",
    "    \n",
    "    fun teamName(side : String) : String = when(side) {\n",
    "        \"away\" -> source.awayteam\n",
    "        \"home\" -> source.hometeam\n",
    "        \"tie\" -> \"tie\"\n",
    "        else -> \"uh-oh\"\n",
    "    }\n",
    "    \n",
    "    val winner : String = leader(source.awayscore, source.homescore)\n",
    "    val winningTeam : String = teamName(winner)\n",
    "    val losingTeam : String = teamName(opponent(winner))\n",
    "    \n",
    "    fun teamRecord(team : String, season : Int) : TeamRecord = \n",
    "        teamRecords.first { r -> r.season == season && r.teamname == team }\n",
    "    \n",
    "    val margins : List<Margin> = source.scoringplays\n",
    "        .map { p -> Margin(\n",
    "                points = Math.abs(p.awayscore - p.homescore),\n",
    "                timeElapsed = p.secondsElapsed,\n",
    "                leadingSide = leader(p.awayscore, p.homescore),\n",
    "                wonGame = leader(p.awayscore, p.homescore) == this.winner\n",
    "            )}\n",
    "    \n",
    "    val largestPointDiff : Int = margins.map {m -> m.points}.maxOrNull() ?: 0\n",
    "    val largestFirstHalfPointDiff : Int = margins.filter {m -> m.timeElapsed <= 1800}.map {m -> m.points}.maxOrNull() ?: 0\n",
    "    \n",
    "    val largestMargin : Margin = margins.sortedByDescending { m -> m.points }.first()\n",
    "    val largestFirstHalfMargin : Margin? = margins.filter {m -> m.timeElapsed <= 1800}\n",
    "                                            .sortedByDescending { m -> m.points }.firstOrNull() // ?: Margin(0, 1800, \"tie\", null, false)\n",
    "    \n",
    "    val display = largestFirstHalfMargin?.let { fhm -> \"s${source.season}-w${source.week.toString().padStart(2, '0')} \" +\n",
    "        \"${fhm.trailingTeam} (${opponent(fhm.leadingSide)}) trailed by \" +\n",
    "        \"${fhm.points} to ${fhm.leadingTeam} \" + \n",
    "        \"and ${if (winner == fhm.leadingSide) \"lost\" else if (winner == \"tie\") \"tied\" else \"won\"} :: \" +\n",
    "        \"final record: ${teamRecord(fhm.trailingTeam, source.season).wins} wins\"\n",
    "    } ?: \"no first-half scoring\"\n",
    "     \n",
    "}"
   ]
  },
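  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A pared-down sketch of the margin logic, using only the standard library. `Snapshot`, `SimpleMargin`, `side` and `toMargins` are hypothetical simplifications of `ScoringPlay`, `Margin` and `leader`; the sample scores are the first four scoring plays of the Packers at Bears game shown earlier.\n",
    "\n",
    "```kotlin\n",
    "import kotlin.math.abs\n",
    "\n",
    "// Hypothetical, simplified stand-ins for ScoringPlay and GameAnalysis.Margin\n",
    "data class Snapshot(val secondsElapsed : Int, val awayscore : Int, val homescore : Int)\n",
    "data class SimpleMargin(val points : Int, val timeElapsed : Int, val leadingSide : String)\n",
    "\n",
    "fun side(away : Int, home : Int) : String = when {\n",
    "    away > home -> \"away\"\n",
    "    away < home -> \"home\"\n",
    "    else -> \"tie\"\n",
    "}\n",
    "\n",
    "// One margin per scoring play: the absolute point difference and who is ahead\n",
    "fun toMargins(plays : List<Snapshot>) : List<SimpleMargin> =\n",
    "    plays.map { p -> SimpleMargin(abs(p.awayscore - p.homescore), p.secondsElapsed, side(p.awayscore, p.homescore)) }\n",
    "\n",
    "val plays = listOf(\n",
    "    Snapshot(332, 0, 3), Snapshot(857, 7, 3), Snapshot(1331, 7, 10), Snapshot(1648, 10, 10)\n",
    ")\n",
    "val firstHalf = toMargins(plays).filter { it.timeElapsed <= 1800 }\n",
    "val largest : SimpleMargin? = firstHalf.maxByOrNull { it.points }  // null when nobody scored before halftime\n",
    "\n",
    "println(largest)  // SimpleMargin(points=4, timeElapsed=857, leadingSide=away)\n",
    "println(toMargins(emptyList()).maxByOrNull { it.points })  // null\n",
    "```\n",
    "\n",
    "The nullable result is the reason `largestFirstHalfMargin` is declared as `Margin?` and `display` falls back to a default string."
   ]
  },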
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### convert our \"raw\" scraped games to games ready for analysis (GameAnalysis objects)\n",
    "\n",
    "we also need to define the \"qualification\" criteria and filter our list of games"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "val allgames = rawGames.map {GameAnalysis(it, teamRecords)}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "120"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
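    "// defined outside the class as an extension property:\n",
    "// a regular-season game qualifies if either team led by 21+ points at some point in the first half\n",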
    "val GameAnalysis.qualifies : Boolean \n",
    "    get() = this.margins.filter {m -> m.points >= 21 && m.timeElapsed <= 1800}.isNotEmpty() && !this.isPlayoff\n",
    "\n",
    "val qualifyingGames : MutableList<GameAnalysis> = allgames.filter {g -> g.qualifies}.toMutableList()\n",
    "qualifyingGames.size"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "s2015-w01 Houston Texans (home) trailed by 21 to Kansas City Chiefs and lost :: final record: 9 wins\n",
      "s2015-w01 Tampa Bay Buccaneers (home) trailed by 28 to Tennessee Titans and lost :: final record: 6 wins\n",
      "s2015-w01 Oakland Raiders (home) trailed by 24 to Cincinnati Bengals and lost :: final record: 7 wins\n",
      "s2015-w02 San Francisco 49ers (away) trailed by 26 to Pittsburgh Steelers and lost :: final record: 5 wins\n",
      "s2015-w02 Tennessee Titans (away) trailed by 21 to Cleveland Browns and lost :: final record: 3 wins\n",
      "s2015-w03 New York Jets (home) trailed by 24 to Philadelphia Eagles and lost :: final record: 10 wins\n",
      "s2015-w03 San Francisco 49ers (away) trailed by 28 to Arizona Cardinals and lost :: final record: 5 wins\n",
      "s2015-w03 Miami Dolphins (home) trailed by 27 to Buffalo Bills and lost :: final record: 6 wins\n",
      "s2015-w04 Houston Texans (away) trailed by 28 to Atlanta Falcons and lost :: final record: 9 wins\n",
      "s2015-w05 Detroit Lions (home) trailed by 21 to Arizona Cardinals and lost :: final record: 7 wins\n",
      "s2015-w07 Buffalo Bills (away) trailed by 24 to Jacksonville Jaguars and lost :: final record: 8 wins\n",
      "s2015-w07 Houston Texans (away) trailed by 41 to Miami Dolphins and lost :: final record: 9 wins\n",
      "s2015-w07 Washington Redskins (home) trailed by 24 to Tampa Bay Buccaneers and won :: final record: 9 wins\n",
      "s2015-w07 San Diego Chargers (home) trailed by 27 to Oakland Raiders and lost :: final record: 4 wins\n",
      "s2015-w08 Detroit Lions (away) trailed by 21 to Kansas City Chiefs and lost :: final record: 7 wins\n",
      "s2015-w13 Minnesota Vikings (home) trailed by 21 to Seattle Seahawks and lost :: final record: 11 wins\n",
      "s2015-w14 Tennessee Titans (away) trailed by 27 to New York Jets and lost :: final record: 3 wins\n",
      "s2015-w14 Atlanta Falcons (away) trailed by 28 to Carolina Panthers and lost :: final record: 8 wins\n",
      "s2015-w15 Buffalo Bills (away) trailed by 21 to Washington Redskins and lost :: final record: 8 wins\n",
      "s2015-w15 Tennessee Titans (away) trailed by 21 to New England Patriots and lost :: final record: 3 wins\n",
      "s2015-w15 San Francisco 49ers (home) trailed by 21 to Cincinnati Bengals and lost :: final record: 5 wins\n",
      "s2015-w15 Miami Dolphins (away) trailed by 23 to San Diego Chargers and lost :: final record: 6 wins\n",
      "s2015-w16 Jacksonville Jaguars (away) trailed by 24 to New Orleans Saints and lost :: final record: 5 wins\n",
      "s2015-w17 Dallas Cowboys (home) trailed by 24 to Washington Redskins and lost :: final record: 4 wins\n",
      "s2015-w17 Arizona Cardinals (home) trailed by 24 to Seattle Seahawks and lost :: final record: 13 wins\n",
      "s2015-w17 Tampa Bay Buccaneers (away) trailed by 21 to Carolina Panthers and lost :: final record: 6 wins\n",
      "s2016-w02 Miami Dolphins (away) trailed by 24 to New England Patriots and lost :: final record: 10 wins\n",
      "s2016-w02 Tampa Bay Buccaneers (away) trailed by 24 to Arizona Cardinals and lost :: final record: 9 wins\n",
      "s2016-w02 Jacksonville Jaguars (away) trailed by 21 to San Diego Chargers and lost :: final record: 3 wins\n",
      "s2016-w03 Detroit Lions (away) trailed by 28 to Green Bay Packers and lost :: final record: 9 wins\n",
      "s2016-w03 Chicago Bears (away) trailed by 21 to Dallas Cowboys and lost :: final record: 3 wins\n",
      "s2016-w03 San Francisco 49ers (away) trailed by 21 to Seattle Seahawks and lost :: final record: 2 wins\n",
      "s2016-w04 Kansas City Chiefs (away) trailed by 29 to Pittsburgh Steelers and lost :: final record: 12 wins\n",
      "s2016-w05 Houston Texans (away) trailed by 24 to Minnesota Vikings and lost :: final record: 9 wins\n",
      "s2016-w05 Cincinnati Bengals (away) trailed by 21 to Dallas Cowboys and lost :: final record: 6 wins\n",
      "s2016-w06 Carolina Panthers (away) trailed by 21 to New Orleans Saints and lost :: final record: 6 wins\n",
      "s2016-w08 Arizona Cardinals (away) trailed by 24 to Carolina Panthers and lost :: final record: 7 wins\n",
      "s2016-w08 Jacksonville Jaguars (away) trailed by 27 to Tennessee Titans and lost :: final record: 3 wins\n",
      "s2016-w10 Green Bay Packers (away) trailed by 25 to Tennessee Titans and lost :: final record: 10 wins\n",
      "s2016-w11 Tennessee Titans (away) trailed by 21 to Indianapolis Colts and lost :: final record: 9 wins\n",
      "s2016-w13 Miami Dolphins (away) trailed by 24 to Baltimore Ravens and lost :: final record: 10 wins\n",
      "s2016-w13 New York Jets (home) trailed by 21 to Indianapolis Colts and lost :: final record: 5 wins\n",
      "s2016-w14 San Diego Chargers (away) trailed by 23 to Carolina Panthers and lost :: final record: 5 wins\n",
      "s2016-w14 Los Angeles Rams (home) trailed by 21 to Atlanta Falcons and lost :: final record: 4 wins\n",
      "s2016-w15 Minnesota Vikings (home) trailed by 27 to Indianapolis Colts and lost :: final record: 8 wins\n",
      "s2016-w15 San Francisco 49ers (away) trailed by 21 to Atlanta Falcons and lost :: final record: 2 wins\n",
      "s2016-w16 New York Jets (away) trailed by 27 to New England Patriots and lost :: final record: 5 wins\n",
      "s2016-w17 New Orleans Saints (away) trailed by 22 to Atlanta Falcons and lost :: final record: 7 wins\n",
      "s2017-w01 Indianapolis Colts (away) trailed by 24 to Los Angeles Rams and lost :: final record: 4 wins\n",
      "s2017-w02 Chicago Bears (away) trailed by 26 to Tampa Bay Buccaneers and lost :: final record: 5 wins\n",
      "s2017-w03 Baltimore Ravens (away) trailed by 23 to Jacksonville Jaguars and lost :: final record: 9 wins\n",
      "s2017-w03 Cleveland Browns (away) trailed by 21 to Indianapolis Colts and lost :: final record: 0 wins\n",
      "s2017-w04 Chicago Bears (away) trailed by 21 to Green Bay Packers and lost :: final record: 5 wins\n",
      "s2017-w04 Cleveland Browns (home) trailed by 21 to Cincinnati Bengals and lost :: final record: 0 wins\n",
      "s2017-w04 Tennessee Titans (away) trailed by 21 to Houston Texans and lost :: final record: 9 wins\n",
      "s2017-w05 Arizona Cardinals (away) trailed by 21 to Philadelphia Eagles and lost :: final record: 8 wins\n",
      "s2017-w06 Cleveland Browns (away) trailed by 21 to Houston Texans and lost :: final record: 0 wins\n",
      "s2017-w06 Tampa Bay Buccaneers (away) trailed by 24 to Arizona Cardinals and lost :: final record: 5 wins\n",
      "s2017-w06 Detroit Lions (away) trailed by 21 to New Orleans Saints and lost :: final record: 9 wins\n",
      "s2017-w07 Arizona Cardinals (away) trailed by 23 to Los Angeles Rams and lost :: final record: 8 wins\n",
      "s2017-w09 Denver Broncos (away) trailed by 22 to Philadelphia Eagles and lost :: final record: 5 wins\n",
      "s2017-w11 Buffalo Bills (away) trailed by 30 to Los Angeles Chargers and lost :: final record: 9 wins\n",
      "s2017-w12 Chicago Bears (away) trailed by 24 to Philadelphia Eagles and lost :: final record: 5 wins\n",
      "s2017-w15 Cincinnati Bengals (away) trailed by 24 to Minnesota Vikings and lost :: final record: 7 wins\n",
      "s2017-w15 Houston Texans (away) trailed by 31 to Jacksonville Jaguars and lost :: final record: 4 wins\n",
      "s2017-w15 Seattle Seahawks (home) trailed by 34 to Los Angeles Rams and lost :: final record: 9 wins\n",
      "s2018-w01 Buffalo Bills (away) trailed by 26 to Baltimore Ravens and lost :: final record: 6 wins\n",
      "s2018-w01 Arizona Cardinals (home) trailed by 21 to Washington Redskins and lost :: final record: 3 wins\n",
      "s2018-w02 Baltimore Ravens (away) trailed by 21 to Cincinnati Bengals and lost :: final record: 10 wins\n",
      "s2018-w02 Pittsburgh Steelers (home) trailed by 21 to Kansas City Chiefs and lost :: final record: 9 wins\n",
      "s2018-w02 Buffalo Bills (home) trailed by 25 to Los Angeles Chargers and lost :: final record: 6 wins\n",
      "s2018-w03 Minnesota Vikings (home) trailed by 27 to Buffalo Bills and lost :: final record: 8 wins\n",
      "s2018-w03 San Francisco 49ers (away) trailed by 28 to Kansas City Chiefs and lost :: final record: 4 wins\n",
      "s2018-w04 Miami Dolphins (away) trailed by 24 to New England Patriots and lost :: final record: 7 wins\n",
      "s2018-w04 Tampa Bay Buccaneers (away) trailed by 35 to Chicago Bears and lost :: final record: 5 wins\n",
      "s2018-w05 Indianapolis Colts (away) trailed by 21 to New England Patriots and lost :: final record: 10 wins\n",
      "s2018-w05 Green Bay Packers (away) trailed by 24 to Detroit Lions and lost :: final record: 6 wins\n",
      "s2018-w06 Jacksonville Jaguars (away) trailed by 24 to Dallas Cowboys and lost :: final record: 5 wins\n",
      "s2018-w07 Buffalo Bills (away) trailed by 24 to Indianapolis Colts and lost :: final record: 6 wins\n",
      "s2018-w07 Arizona Cardinals (home) trailed by 32 to Denver Broncos and lost :: final record: 3 wins\n",
      "s2018-w07 San Francisco 49ers (home) trailed by 22 to Los Angeles Rams and lost :: final record: 4 wins\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "s2018-w08 Tampa Bay Buccaneers (away) trailed by 21 to Cincinnati Bengals and lost :: final record: 5 wins\n",
      "s2018-w09 Tampa Bay Buccaneers (away) trailed by 28 to Carolina Panthers and lost :: final record: 5 wins\n",
      "s2018-w09 Buffalo Bills (home) trailed by 28 to Chicago Bears and lost :: final record: 6 wins\n",
      "s2018-w09 Los Angeles Rams (away) trailed by 21 to New Orleans Saints and lost :: final record: 13 wins\n",
      "s2018-w10 Cincinnati Bengals (home) trailed by 28 to New Orleans Saints and lost :: final record: 6 wins\n",
      "s2018-w10 Detroit Lions (away) trailed by 26 to Chicago Bears and lost :: final record: 6 wins\n",
      "s2018-w10 New York Jets (home) trailed by 31 to Buffalo Bills and lost :: final record: 4 wins\n",
      "s2018-w11 Tennessee Titans (away) trailed by 24 to Indianapolis Colts and lost :: final record: 9 wins\n",
      "s2018-w12 Cincinnati Bengals (home) trailed by 28 to Cleveland Browns and lost :: final record: 6 wins\n",
      "s2018-w13 Cleveland Browns (away) trailed by 23 to Houston Texans and lost :: final record: 7 wins\n",
      "s2018-w14 Washington Redskins (home) trailed by 34 to New York Giants and lost :: final record: 7 wins\n",
      "s2018-w15 Miami Dolphins (away) trailed by 21 to Minnesota Vikings and lost :: final record: 7 wins\n",
      "s2018-w17 Green Bay Packers (home) trailed by 21 to Detroit Lions and lost :: final record: 6 wins\n",
      "s2018-w17 San Francisco 49ers (away) trailed by 25 to Los Angeles Rams and lost :: final record: 4 wins\n",
      "s2018-w17 New Orleans Saints (home) trailed by 23 to Carolina Panthers and lost :: final record: 13 wins\n",
      "s2018-w17 Oakland Raiders (away) trailed by 21 to Kansas City Chiefs and lost :: final record: 4 wins\n",
      "s2019-w01 Atlanta Falcons (away) trailed by 21 to Minnesota Vikings and lost :: final record: 7 wins\n",
      "s2019-w01 Miami Dolphins (home) trailed by 39 to Baltimore Ravens and lost :: final record: 5 wins\n",
      "s2019-w02 Minnesota Vikings (away) trailed by 21 to Green Bay Packers and lost :: final record: 10 wins\n",
      "s2019-w03 Oakland Raiders (away) trailed by 21 to Minnesota Vikings and lost :: final record: 7 wins\n",
      "s2019-w03 Washington Redskins (home) trailed by 28 to Chicago Bears and lost :: final record: 3 wins\n",
      "s2019-w04 Los Angeles Rams (home) trailed by 21 to Tampa Bay Buccaneers and lost :: final record: 9 wins\n",
      "s2019-w05 New York Jets (away) trailed by 21 to Philadelphia Eagles and lost :: final record: 7 wins\n",
      "s2019-w06 Philadelphia Eagles (away) trailed by 21 to Minnesota Vikings and lost :: final record: 9 wins\n",
      "s2019-w06 Los Angeles Chargers (home) trailed by 21 to Pittsburgh Steelers and lost :: final record: 5 wins\n",
      "s2019-w07 New York Jets (home) trailed by 24 to New England Patriots and lost :: final record: 7 wins\n",
      "s2019-w08 Atlanta Falcons (home) trailed by 24 to Seattle Seahawks and lost :: final record: 7 wins\n",
      "s2019-w08 Carolina Panthers (away) trailed by 24 to San Francisco 49ers and lost :: final record: 5 wins\n",
      "s2019-w10 Cincinnati Bengals (home) trailed by 25 to Baltimore Ravens and lost :: final record: 2 wins\n",
      "s2019-w12 Miami Dolphins (away) trailed by 28 to Cleveland Browns and lost :: final record: 5 wins\n",
      "s2019-w12 Green Bay Packers (away) trailed by 23 to San Francisco 49ers and lost :: final record: 13 wins\n",
      "s2019-w12 Los Angeles Rams (home) trailed by 22 to Baltimore Ravens and lost :: final record: 9 wins\n",
      "s2019-w13 Jacksonville Jaguars (home) trailed by 25 to Tampa Bay Buccaneers and lost :: final record: 6 wins\n",
      "s2019-w13 Oakland Raiders (away) trailed by 21 to Kansas City Chiefs and lost :: final record: 7 wins\n",
      "s2019-w14 Houston Texans (home) trailed by 28 to Denver Broncos and lost :: final record: 10 wins\n",
      "s2019-w14 Jacksonville Jaguars (home) trailed by 21 to Los Angeles Chargers and lost :: final record: 6 wins\n",
      "s2019-w15 Detroit Lions (home) trailed by 21 to Tampa Bay Buccaneers and lost :: final record: 3 wins\n",
      "s2019-w15 Los Angeles Rams (away) trailed by 21 to Dallas Cowboys and lost :: final record: 9 wins\n",
      "s2019-w17 Carolina Panthers (home) trailed by 35 to New Orleans Saints and lost :: final record: 5 wins\n"
     ]
    }
   ],
   "source": [
    "qualifyingGames.filter {g -> !g.isPlayoff}.forEach { qg ->\n",
    "    println(qg.display)\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### \"wrangle\" or rearrange data into digestible form for easy plotting\n",
    "\n",
    "We'd use data frames in Python (pandas) or R (dplyr). Kotlin's `krangl` is less mature and earier we stated our data isn't exactly tabular. Because all our data is in defined classes, we can use those for plots instead."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{0=1, 1=1, 2=2, 3=8, 4=9, 5=16, 6=16, 7=23, 8=12, 9=20, 10=17, 11=11, 12=10, 13=11, 14=2, 15=1, 16=0}"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// key is the number of wins (0-16), value is the number of times that was the final win total\n",
    "// 5 seasons * 32 teams = 160 season win totals\n",
    "\n",
    "val seasonWinTotals : Map<Int, Int> = IntRange(0,16).fold(mutableMapOf<Int, Int>(), { acc, i -> \n",
    "    acc[i] = teamRecords.filter {tr -> tr.wins == i}.size\n",
    "    acc\n",
    "})\n",
    "seasonWinTotals"
   ]
  },
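  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a side note, the same kind of tally can be written without a mutable accumulator. This is only a sketch: `SampleRecord` and `sampleRecords` are made-up stand-ins for the notebook's own record class and `teamRecords`, and only the `wins` field matters here.\n",
    "\n",
    "```kotlin\n",
    "// hypothetical stand-in for the real team record class\n",
    "data class SampleRecord(val wins: Int)\n",
    "\n",
    "val sampleRecords = listOf(9, 6, 9, 0).map { SampleRecord(it) }\n",
    "\n",
    "// associateWith keeps every key in 0..16, including win totals nobody reached\n",
    "val denseTotals: Map<Int, Int> = (0..16).associateWith { w -> sampleRecords.count { it.wins == w } }\n",
    "\n",
    "// groupingBy + eachCount is shorter, but only holds win totals that actually occur\n",
    "val sparseTotals: Map<Int, Int> = sampleRecords.groupingBy { it.wins }.eachCount()\n",
    "\n",
    "println(denseTotals[9])   // 2\n",
    "println(denseTotals[16])  // 0\n",
    "println(sparseTotals[16]) // null\n",
    "```\n",
    "\n",
    "For plotting, `associateWith` is probably the closer match, since it keeps a zero entry for every win total on the x axis."
   ]
  },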
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[9, 6, 7, 5, 3, 10, 5, 6, 9, 7, 8, 9, 9, 4, 7, 11, 3, 8, 8, 3, 5, 6, 5, 4, 13, 6, 10, 9, 3, 9, 3, 2, 12, 9, 6, 6, 7, 3, 10, 9, 10, 5, 5, 4, 8, 2, 5, 7, 4, 5, 9, 0, 5, 0, 9, 8, 0, 5, 9, 8, 5, 9, 5, 7, 4, 9, 6, 3, 10, 9, 6, 8, 4, 7, 5, 10, 6, 5, 6, 3, 4, 5, 5, 6, 13, 6, 6, 4, 9, 6, 7, 7, 7, 6, 4, 13, 4, 7, 5, 10, 7, 3, 9, 7, 9, 5, 7, 7, 5, 2, 5, 13, 9, 6, 7, 10, 6, 3, 9, 5]"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "// simple list of number of total wins by the trailing team in a qualifying game\n",
    "val trailerWinTotals = qualifyingGames.map { \n",
    "    // largestFirstHalfMargin isn't null because our filter ensured it was non-empty, so we can safely use `!!`\n",
    "    qg -> qg.teamRecord(qg.largestFirstHalfMargin!!.trailingTeam, qg.source.season).wins\n",
    "}\n",
    "trailerWinTotals"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "6.475"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "trailerWinTotals.average()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "   <div id=\"P5Ogd1\"></div>\n",
       "   <script type=\"text/javascript\" data-lets-plot-script=\"plot\">\n",
       "       (function() {\n",
       "           var plotSpec={\n",
       "'ggtitle':{\n",
       "'text':\"distribution of total season wins by large-deficit teams\"\n",
       "},\n",
       "'mapping':{\n",
       "'x':\"x\"\n",
       "},\n",
       "'data':{\n",
       "},\n",
       "'ggsize':{\n",
       "'width':640,\n",
       "'height':240\n",
       "},\n",
       "'kind':\"plot\",\n",
       "'scales':[{\n",
       "'aesthetic':\"x\",\n",
       "'name':\"season total wins\"\n",
       "},{\n",
       "'aesthetic':\"y\",\n",
       "'name':\"qualifying games\"\n",
       "},{\n",
       "'aesthetic':\"x\",\n",
       "'limits':[0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0,16.0]\n",
       "}],\n",
       "'layers':[{\n",
       "'stat':\"count\",\n",
       "'mapping':{\n",
       "},\n",
       "'data':{\n",
       "'..count..':[19.0,17.0,16.0,21.0,10.0,8.0,7.0,10.0,1.0,4.0,3.0,1.0,3.0],\n",
       "'x':[9.0,6.0,7.0,5.0,3.0,10.0,8.0,4.0,11.0,13.0,2.0,12.0,0.0]\n",
       "},\n",
       "'position':\"stack\",\n",
       "'geom':\"bar\"\n",
       "}]\n",
       "};\n",
       "           var plotContainer = document.getElementById(\"P5Ogd1\");\n",
       "           window.letsPlotCall(function() {{\n",
       "               LetsPlot.buildPlotFromProcessedSpecs(plotSpec, -1, -1, plotContainer);\n",
       "           }});\n",
       "       })();    \n",
       "   </script>"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "val p = lets_plot() { x = trailerWinTotals } + ggsize(640, 240)\n",
    "\n",
    "p + geom_bar(stat=Stat.count()) +\n",
    "    xlab(\"season total wins\") + ylab(\"qualifying games\") + \n",
    "    xlim(IntRange(0,16)) + ggtitle(\"distribution of total season wins by large-deficit teams\")\n",
    "    \n",
    "// note Stat.count() is the default for bar charts (geom_bar) so we can leave it out"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### quick timeout... did you notice 13-win teams qualified as large-deficit teams 4 times?\n",
    "\n",
    "is that really correct? maybe they were locked into a playoff spot and resting personnel?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[/boxscores/201601030crd.htm, /boxscores/201811040nor.htm, /boxscores/201812300nor.htm, /boxscores/201911240sfo.htm]"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "qualifyingGames.filter { qg -> qg.teamRecord(qg.largestFirstHalfMargin!!.trailingTeam, qg.source.season).wins == 13 }\n",
    "    .map { qg -> qg.source.pfrURL } "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- 2015 w17 : 9-6 Seattle led 13-2 Arizona 30-6 in first half and won 36-6. This was a Week 17 game, and QB Palmer sat in the 2nd half. Not really qualifies as a \"rest\" game.\n",
    "- 2018 w9 : 6-1 New Orleans led 8-0 LA Rams 35-14 in first half. LA tied the game 35-35 but ultimately lost 45-35.\n",
    "- 2018 w17 : 6-9 Carolina led 13-2 New Orleans 23-0 in first half. NO did not play Brees or Kamara. Definitely a \"rest\" game.\n",
    "- 2019 w12 : 9-1 San Fran led 8-2 Green Bay 23-0 at halftime. SF won 37-8. Legit blowout.\n",
    "\n",
    "So only 1 definite \"rest the starters game,\" the other 3 were matchups between playoff teams. Rams managed to tie their game. It's fair to exclude the NO-Carolina game but we should keep the others."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "119"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "qualifyingGames.removeIf { qg -> qg.source.pfrURL == \"/boxscores/201812300nor.htm\" }\n",
    "qualifyingGames.size"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "6.420168067226891"
     ]
    },
    {
     "data": {
      "text/html": [
       "   <div id=\"OUiebL\"></div>\n",
       "   <script type=\"text/javascript\" data-lets-plot-script=\"plot\">\n",
       "       (function() {\n",
       "           var plotSpec={\n",
       "'ggtitle':{\n",
       "'text':\"distribution of total season wins by large-deficit teams\"\n",
       "},\n",
       "'mapping':{\n",
       "'x':\"x\"\n",
       "},\n",
       "'data':{\n",
       "},\n",
       "'ggsize':{\n",
       "'width':640,\n",
       "'height':240\n",
       "},\n",
       "'kind':\"plot\",\n",
       "'scales':[{\n",
       "'aesthetic':\"x\",\n",
       "'name':\"season total wins\"\n",
       "},{\n",
       "'aesthetic':\"y\",\n",
       "'name':\"qualifying games\"\n",
       "},{\n",
       "'aesthetic':\"x\",\n",
       "'limits':[0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0,16.0]\n",
       "}],\n",
       "'layers':[{\n",
       "'stat':\"count\",\n",
       "'mapping':{\n",
       "},\n",
       "'data':{\n",
       "'..count..':[19.0,17.0,16.0,21.0,10.0,8.0,7.0,10.0,1.0,3.0,3.0,1.0,3.0],\n",
       "'x':[9.0,6.0,7.0,5.0,3.0,10.0,8.0,4.0,11.0,13.0,2.0,12.0,0.0]\n",
       "},\n",
       "'position':\"stack\",\n",
       "'geom':\"bar\"\n",
       "}]\n",
       "};\n",
       "           var plotContainer = document.getElementById(\"OUiebL\");\n",
       "           window.letsPlotCall(function() {{\n",
       "               LetsPlot.buildPlotFromProcessedSpecs(plotSpec, -1, -1, plotContainer);\n",
       "           }});\n",
       "       })();    \n",
       "   </script>"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "val trailerWinTotals = qualifyingGames.map { \n",
    "    // largestFirstHalfMargin isn't null because our filter ensured it was non-empty, so we can safely use `!!`\n",
    "    qg -> qg.teamRecord(qg.largestFirstHalfMargin!!.trailingTeam, qg.source.season).wins\n",
    "}\n",
    "print(trailerWinTotals.average())\n",
    "\n",
    "val p = lets_plot() { x = trailerWinTotals } + ggsize(640, 240)\n",
    "p + geom_bar() +\n",
    "    xlab(\"season total wins\") + ylab(\"qualifying games\") + \n",
    "    xlim(IntRange(0,16)) + ggtitle(\"distribution of total season wins by large-deficit teams\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "   <div id=\"c0Rlrl\"></div>\n",
       "   <script type=\"text/javascript\" data-lets-plot-script=\"plot\">\n",
       "       (function() {\n",
       "           var plotSpec={\n",
       "'ggtitle':{\n",
       "'text':\"distribution of total season wins by all teams, 2015-2019\"\n",
       "},\n",
       "'mapping':{\n",
       "'x':\"x\"\n",
       "},\n",
       "'data':{\n",
       "'x':[0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0,16.0]\n",
       "},\n",
       "'ggsize':{\n",
       "'width':640,\n",
       "'height':240\n",
       "},\n",
       "'kind':\"plot\",\n",
       "'scales':[{\n",
       "'aesthetic':\"x\",\n",
       "'name':\"season total wins\"\n",
       "},{\n",
       "'aesthetic':\"y\",\n",
       "'name':\"seasons\"\n",
       "},{\n",
       "'aesthetic':\"x\",\n",
       "'limits':[0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0,16.0]\n",
       "}],\n",
       "'layers':[{\n",
       "'stat':\"identity\",\n",
       "'mapping':{\n",
       "'y':\"y\"\n",
       "},\n",
       "'data':{\n",
       "'y':[1.0,1.0,2.0,8.0,9.0,16.0,16.0,23.0,12.0,20.0,17.0,11.0,10.0,11.0,2.0,1.0,0.0]\n",
       "},\n",
       "'position':\"stack\",\n",
       "'geom':\"bar\"\n",
       "}]\n",
       "};\n",
       "           var plotContainer = document.getElementById(\"c0Rlrl\");\n",
       "           window.letsPlotCall(function() {{\n",
       "               LetsPlot.buildPlotFromProcessedSpecs(plotSpec, -1, -1, plotContainer);\n",
       "           }});\n",
       "       })();    \n",
       "   </script>"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "val p = lets_plot() { x = seasonWinTotals.keys } + ggsize(640, 240)\n",
    "\n",
    "p + geom_bar(stat=Stat.identity) { y=seasonWinTotals.values } +\n",
    "    xlab(\"season total wins\") + ylab(\"seasons\") + \n",
    "    xlim(IntRange(0,16)) + ggtitle(\"distribution of total season wins by all teams, 2015-2019\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### now we're getting somewhere...\n",
    "\n",
    "We plotted how often teams finish with 0 through 16 wins. We also plotted how often our \"large-deficit\" teams finish with 0 through 16 wins. On casual observation, the large-deficit graph doesn't look TOO different, maybe just moved 1.5 games to the left (mean of 6.5 rather than 8).\n",
    "\n",
    "### but let's calculate probability of being in a large-deficit game (as the trailer), by # of season wins:"
   ]
  },
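  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To put numbers on that shift, here is a small sketch that computes the mean of each histogram. The two lists are typed in by hand from the outputs above (the `seasonWinTotals` map and the data behind the second large-deficit chart), and `histogramMean` is a helper name invented for this example.\n",
    "\n",
    "```kotlin\n",
    "// index = season wins (0..16), value = how many times that total occurred\n",
    "val allTeams = listOf(1, 1, 2, 8, 9, 16, 16, 23, 12, 20, 17, 11, 10, 11, 2, 1, 0)\n",
    "val trailers = listOf(3, 0, 3, 10, 10, 21, 17, 16, 7, 19, 8, 1, 1, 3, 0, 0, 0)\n",
    "\n",
    "// mean of the index, weighted by the count stored at that index\n",
    "fun histogramMean(counts: List<Int>): Double =\n",
    "    counts.mapIndexed { wins, n -> wins * n }.sum().toDouble() / counts.sum()\n",
    "\n",
    "println(histogramMean(allTeams)) // 7.96875\n",
    "println(histogramMean(trailers)) // about 6.42\n",
    "```\n",
    "\n",
    "The gap between the two means is a little over 1.5 wins, which is in line with the eyeball estimate."
   ]
  },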
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "val trailerWinCounts : Map<Int, Int> = IntRange(0,16).fold(mutableMapOf<Int, Int>(), { acc, i -> \n",
    "    acc[i] = trailerWinTotals.filter {twt -> twt == i}.size\n",
    "    acc\n",
    "})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "   <div id=\"f9NeFw\"></div>\n",
       "   <script type=\"text/javascript\" data-lets-plot-script=\"plot\">\n",
       "       (function() {\n",
       "           var plotSpec={\n",
       "'ggtitle':{\n",
       "'text':\"Probability of being a large-deficit team\"\n",
       "},\n",
       "'mapping':{\n",
       "'x':\"x\"\n",
       "},\n",
       "'data':{\n",
       "'x':[0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0,16.0]\n",
       "},\n",
       "'ggsize':{\n",
       "'width':640,\n",
       "'height':240\n",
       "},\n",
       "'kind':\"plot\",\n",
       "'scales':[{\n",
       "'aesthetic':\"x\",\n",
       "'name':\"season total wins\"\n",
       "},{\n",
       "'aesthetic':\"y\",\n",
       "'name':\"P(qualifying)\"\n",
       "},{\n",
       "'aesthetic':\"x\",\n",
       "'limits':[0.0,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,14.0,15.0,16.0]\n",
       "}],\n",
       "'layers':[{\n",
       "'stat':\"identity\",\n",
       "'mapping':{\n",
       "'y':\"y\"\n",
       "},\n",
       "'data':{\n",
       "'y':[0.1875,0.0,0.09375,0.078125,0.06944444444444445,0.08203125,0.06640625,0.043478260869565216,0.036458333333333336,0.059375,0.029411764705882353,0.005681818181818182,0.00625,0.017045454545454544,0.0,0.0,NaN]\n",
       "},\n",
       "'position':\"stack\",\n",
       "'geom':\"bar\"\n",
       "}]\n",
       "};\n",
       "           var plotContainer = document.getElementById(\"f9NeFw\");\n",
       "           window.letsPlotCall(function() {{\n",
       "               LetsPlot.buildPlotFromProcessedSpecs(plotSpec, -1, -1, plotContainer);\n",
       "           }});\n",
       "       })();    \n",
       "   </script>"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "val trailerProbabilities : Map<Int, Double> = IntRange(0,16).fold(mutableMapOf<Int, Double>(), { acc, i -> \n",
    "    acc[i] = trailerWinCounts[i]!!.div(16.0 * seasonWinTotals[i]!!) // we won't have nulls because both maps have same keys 0-16\n",
    "    acc\n",
    "})\n",
    "\n",
    "val p = lets_plot() { x = trailerProbabilities.keys } + ggsize(640, 240)\n",
    "\n",
    "p + geom_bar(stat=Stat.identity) { y = trailerProbabilities.values } +\n",
    "    xlab(\"season total wins\") + ylab(\"P(qualifying)\") + \n",
    "    xlim(IntRange(0,16)) + ggtitle(\"Probability of being a large-deficit team\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{0=0.1875, 1=0.0, 2=0.09375, 3=0.078125, 4=0.06944444444444445, 5=0.08203125, 6=0.06640625, 7=0.043478260869565216, 8=0.036458333333333336, 9=0.059375, 10=0.029411764705882353, 11=0.005681818181818182, 12=0.00625, 13=0.017045454545454544, 14=0.0, 15=0.0, 16=NaN}"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "trailerProbabilities"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### ... so, as you might expect, the likelihood of being in a large-deficit is massively larger for weak teams. Playoff teams have a 0.9% chance, mid-range teams (6-10 wins) have a 4.8% chance, bad teams (0-5 wins) have a 7.9% chance."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### That's a 9-to-1 ratio of bad teams to playoff teams, or better than 6-1 of all other teams to playoff teams"
   ]
  },
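  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The bucket percentages and ratios can be reproduced from the two histograms. This is a sketch rather than the exact calculation used above: the lists are typed in by hand from earlier outputs, `bucketProbability` is a made-up helper, and we're assuming that playoff teams means 11 or more wins, since that cutoff matches the 0.9% figure.\n",
    "\n",
    "```kotlin\n",
    "// index = season wins (0..16), value = how many times that total occurred\n",
    "val allTeams = listOf(1, 1, 2, 8, 9, 16, 16, 23, 12, 20, 17, 11, 10, 11, 2, 1, 0)\n",
    "val trailers = listOf(3, 0, 3, 10, 10, 21, 17, 16, 7, 19, 8, 1, 1, 3, 0, 0, 0)\n",
    "\n",
    "// per-game chance of trailing big, for teams whose season wins fall in the given range\n",
    "fun bucketProbability(wins: IntRange): Double {\n",
    "    val trailerGames = wins.map { trailers[it] }.sum()\n",
    "    val gamesPlayed = 16 * wins.map { allTeams[it] }.sum()\n",
    "    return trailerGames.toDouble() / gamesPlayed\n",
    "}\n",
    "\n",
    "val bad = bucketProbability(0..5)       // 47 / 592, about 7.9%\n",
    "val mid = bucketProbability(6..10)      // 67 / 1408, about 4.8%\n",
    "val playoff = bucketProbability(11..16) // 5 / 560, about 0.9%\n",
    "val others = bucketProbability(0..10)   // 114 / 2000, 5.7%\n",
    "\n",
    "println(bad / playoff)    // about 8.9\n",
    "println(others / playoff) // about 6.4\n",
    "```\n",
    "\n",
    "Summing counts within a bucket before dividing avoids the NaN that the per-win-total map produces for 16 wins."
   ]
  },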
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Kotlin",
   "language": "kotlin",
   "name": "kotlin"
  },
  "language_info": {
   "codemirror_mode": "text/x-kotlin",
   "file_extension": ".kt",
   "mimetype": "text/x-kotlin",
   "name": "kotlin",
   "pygments_lexer": "kotlin",
   "version": "1.4.20-dev-2342"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}