The Ultimate Team Generator with the Wolfram Language
Author
Bob Sandheinrich
Title
The Ultimate Team Generator with the Wolfram Language
Description
Setting up teams, including player baggage requests, for an Ultimate Frisbee league using Wolfram graph and network tools. See the code to import and interpret the data, use the Dataset function, create a graph, group into teams, export the data.
Category
Essays, Posts & Presentations
Keywords
Computational Thinking, Data Analysis and Visualization, Other Application Areas, Wolfram Language
URL
http://www.notebookarchive.org/2019-08-0ymunme/
DOI
https://notebookarchive.org/2019-08-0ymunme
Date Added
2019-08-02
Date Last Modified
2019-08-02
File Size
2.88 megabytes
Supplements
Rights
Redistribution rights reserved
Download
Open in Wolfram Cloud
The Ultimate Team Generator with the Wolfram Language
The Ultimate Team Generator with the Wolfram Language
August 2, 2019
Bob Sandheinrich, Development Manager, Document & Media Systems
Every summer, I play in a recreational Ultimate Frisbee league—just “Ultimate” to those who play. It’s a fun, relaxed, coed league where I tend to win more friends than games.
The league is organized by volunteers, and one year, my friend and teammate Nate was volunteered to coordinate it. A couple weeks before the start of the season, Nate came to me with some desperation in his voice over making the teams. The league allows each player to request to play with up to eight other players—disparagingly referred to as their “baggage.” And Nate discovered that with over 100 players in a league, each one requesting a different combination of teammates, creating teams that would please everyone seemed to become more complicated by the minute.
Luckily for him, the Wolfram Language has a suite of graph and network tools for things like social media. I recognized that this seemingly overwhelming problem was actually a fairly simple graph problem. I asked Nate for the data, spent an evening working in a notebook and sent him the teams that night.
Using the Wolfram Language worked so well that—though it’s been years since I first helped out Nate, and the league coordinator has changed—I can count on an annual email volunteering me to make the teams again. And each year, I’ve been able to dig out my notebook and make teams, regularly adding improvements along the way.
Until Nate showed me his problem, I didn’t realize how tricky a situation this could be. Because baggage requests don’t have to be mutual, you can end up with chains of connected players that are larger than the acceptable size of a team. By just looking at Nate’s spreadsheet, it was nearly impossible to divine which baggage requests needed to be denied to make teams.
In addition to determining which baggage requests to decline, the process involves importing and interpreting datasets, grouping cores of players so that teams have similar metrics and exporting results for the league to distribute.
Some Notes
Some Notes
I’ve anonymized the data here, which was fun to do with Wolfram|Alpha. In only a couple lines of code, I replaced the all of the players’ names with notable people of the same gender from the Wolfram Knowledgebase. You can find the code to create this “dummy data” in the downloadable notebook for this post.
In the graph visualizations, I deliberately omitted the players’ names. I wanted to avoid the taint of giving myself an advantage, as I’m also playing in this league. Typically, I don’t know which team I am on until the very end. If any other players in the league are reading this and have doubts, allow my combined 2016–2018 win-loss record of 7–38 serve as definitive proof that if there is a bias, it is decidedly anti-Bob.
Importing the Data
Importing the Data
My first step is to grab the data from the league website. There are two sets of data to import: a list of registered players and a list of baggage requests.
Here I have stored anonymized copies of the player and baggage data as cloud objects:
In[]:=
$urls={"https://www.wolframcloud.com/obj/bobs/Ultimate2019DummyPlayers.html","https://www.wolframcloud.com/obj/bobs/Ultimate2019DummyBaggage.html"};
Player List
Player List
Since I started work on the Wolfram Data Repository a few years ago, I’ve learned a universal truth: any general, automated data importer will quickly fail to automatically import real data. In the real world, it’s all edge cases.
Naively optimistic nonetheless, I attempt to import the player data directly from the webpage using the automated tools in [url,"FullData"].
Import
The failure originates with two columns with checkboxes defined by custom CSS that are not properly captured by the HTML importer. Here is the code that failed:
In[]:=
autoplayers=Import[$urls[[1]],"FullData"][[2,2,1]];Short[autoplayers]
Out[]//Short=
{{1},{{05.05.19,3253,13166,3,,26,Female},152}}
Interpret the Data
Interpret the Data
In[]:=
interpretRawData[raw_]:=interpretRow/@getDataRows[raw]interpretRow[row_]:=MapIndexed[interpretValue[##]&,row]interpretValue[val_,{Key[k_]}]:=interpretValue[val,k]interpretValue[val_,"Entered"]:=DateObject[{val,{"Month",".","Day",".","YearShort"}}]interpretValue[val_,"Name"]:=ImportString[StringReplace[StringTrim[val],"\n"|"\t"""],"HTML"]interpretValue[val_,"age"|"pow"|"Mem ID"]:=ToExpression[val]interpretValue[val_,"exp"|"skl"|"ath"]:=ToExpression[StringDrop[val,1]]interpretValue[val_,"paid"|"mgr"]:=Interpreter["Boolean"][val]interpretValue[val_,_]:=valgetDataKeys[raw_]:=StringReplace[First@First[raw],Whitespace""]getDataRows[raw_]:=With[{k=getDataKeys[raw]},AssociationThread[k#]&/@raw[[2]]]
This shows that none of the players have paid. Since I paid, I know there’s a problem!
In[]:=
dataFromAutoImport=interpretRawData[autoplayers];
In[]:=
Lookup[dataFromAutoImport,"paid"]//Counts
Out[]=
False153
As in most real-world data problems, some manual data work is required. In the following code, I retrieve the data with a web request and then import it into WDF by manually parsing the HTML and formatting the values.
In[]:=
resp=URLRead[First@$urls]
Out[]=
HTTPResponse
|
In[]:=
html=resp["Body"];
The response contains a raw HTML document:
In[]:=
Snippet[html,4]
Out[]=
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html xmlns="http://www.w3.org/TR/REC-html40"><head>
To parse the HTML directly, I need more utilities.
In[]:=
getFirstTable[html_]:=First[StringCases[html,"<table"~~Shortest[___]~~"</table"]]getTableRows[table_]:=StringCases[table,"<tr"~~Shortest[___]~~"</tr"]getTableKeys[rows_]:=StringReplace[StringCases[First[rows],"<th"~~Shortest[___]~~">"~~k:Shortest[__]~~"</th"k],"<"~~Shortest[___]~~">"""]getRowValues[row_]:=StringReplace[StringCases[row,"<td"~~Shortest[___]~~">"~~k:Shortest[___]~~"</td"k],"<"~~cont:Shortest[___]~~">":>If[StringFreeQ[cont,"check"],"","True"]]importPlayerTable[html_]:=With[{rows=getTableRows[getFirstTable[html]]},With[{keys=getTableKeys[rows]},interpretRow[AssociationThread[keys->getRowValues[#]]]&/@Rest[rows]]]
In[]:=
importeddata=importPlayerTable[html];
Now, "paid" shows the correct values.
In[]:=
Lookup[importeddata,"paid"]//Counts
Out[]=
True143,False10
Dataset
Dataset
I like working with . It makes it easy to query the data as well as provides a nice visualization in a table.
Dataset
In[]:=
playerdata=Dataset[importeddata]
Out[]=
|
In case manually parsing HTML was not messy enough, the chummy Ultimate community of St. Louis has another trick. Players know that instead of properly entering their requests into the registration form, they can just email the league coordinator and tell them what they want. To help sort out these emails, I made a utility function for finding players by name or ID.
In[]:=
findPlayer[str_String]:=playerdata[Select[StringContainsQ[#Name,str,IgnoreCaseTrue]&]]findPlayer[n_Integer]:=playerdata[Select[#["Mem ID"]===n&]]playerPosition[n_Integer]:=First@Flatten@FirstPosition[playerdata,n]
For example, Maria Garcia accidentally registered as a manager (or captain). Then she emailed to say she does not want that responsibility.
In[]:=
findPlayer["Garcia"]
Out[]=
|
In[]:=
p=playerPosition[2924]
Out[]=
14
In[]:=
importeddata[[p,"mgr"]]=False
Out[]=
False
After several other manual adjustments, I recreate the dataset.
In[]:=
playerdata=Dataset[importeddata];
Then I get a list of the unique IDs that we will use to create a graph.
In[]:=
ids=Normal[playerdata[All,"Mem ID"]];
In[]:=
Length[ids]
Out[]=
153
Baggage
Baggage
For the baggage data, my optimism pays off; the automatic importing works!
In[]:=
rawbaggage=Import[$urls[[2]],"Data"];
In[]:=
Length[rawbaggage]
Out[]=
408
The data is very simple, just pairs of ID numbers.
In[]:=
Short[rawbaggage]
Out[]//Short=
{{196,190},{196,3187},{207,723},{207,64},{207,2318},398,{4638,3966},{4641,4631},{4641,4633},{4641,4630},{4644,3778}}
In[]:=
edges=DirectedEdge@@@rawbaggage;
Create a Graph Including Baggage
Create a Graph Including Baggage
Now we’ve reached the interesting part, where the visualization and manipulation makes a hard problem easy.
Graph
I start by creating one large graph, where each vertex is a player and each edge is a baggage request. Because baggage requests need not be mutual, it’s a directed graph.
To add more information to the visualization, I style the graph using green for men, orange for women and large vertices for captains. The slightly darker colors represent people who have not yet paid. I label the vertices with the unique member ID ("Mem ID") property.
Initialize Symbols
Initialize Symbols
In[]:=
$numTeams=8;coreIDs={};teams=teamlist=List/@Range[$numTeams];selected=0;
Graph Styling Tools
Graph Styling Tools
In[]:=
teamSummary[tn_]:=teamsummary[tn, teams[[tn]]]teamsummary[tn_, team_]:=With[{rows=playerdata[Select[MemberQ[team,#["Mem ID"]]&]]},Style["Team "<>ToString[tn]<>": "<>ToString[Length[team]]<>" Players\n"<>TextString@Normal@Counts[rows[All,"Gender"]]<>"\nMedian Age = "<>ToString[N@Median[rows[All,"age"]]]<>"\nMean Power = "<>ToString[N@Mean[rows[All,"pow"]]], 14]]Clear[$genderColors,$bigCaptains,$nameLabels,$idLabels];$idLabels:=($idLabels=Thread[idsids]);$genderColors:=($genderColors=Normal[playerdata[All,#["Mem ID"]If[!TrueQ[#paid],Darker,Identity]@Switch[#Gender,"Male",Green,"Female",Orange,_,Print["g"#Gender];Black]&]])$bigCaptains:=($bigCaptains=Normal[playerdata[All,#["Mem ID"]If[TrueQ[#mgr],1,.4]&]])$nameLabels:=($nameLabels=Normal[playerdata[All,#["Mem ID"]StringTake[#Name,UpTo[25]]&]])
Immediately there’s a big problem: about a third of the eight-team league is connected in a continuous “baggage chain.” This is pretty typical of my yearly experience. In the worst year so far, over half the players were connected to each other.
In[]:=
originalgraph=Graph[ids,edges,VertexLabels$idLabels,VertexStyle$genderColors,VertexSize$bigCaptains]
Out[]=
Determine Desired Team Shape
Determine Desired Team Shape
Before I start cutting those groups apart, I compute some quick statistics about how the average team should look.
Number of Players
Number of Players
In[]:=
N[Length[ids]/$numTeams]
Out[]=
19.125
Number of Women per Team
Number of Women per Team
In[]:=
N@Length[playerdata[Select[#Gender"Female"&]]]/$numTeams
Out[]=
4.875
Average Age and Power
Average Age and Power
In[]:=
{N@playerdata[Mean,"age"],playerdata[Median,"age"]}
Out[]=
{30.7451,29}
In[]:=
N@playerdata[Mean,"pow"]
Out[]=
55.1961
Baggage-Breaking Chains
Baggage-Breaking Chains
These simple statistics give me an idea of what size group I can allow to stay. To split the large graph into connected groups that I call “cores,” I use .
WeaklyConnectedGraphComponents
In[]:=
connectedgraphs=WeaklyConnectedGraphComponents[originalgraph];
In[]:=
VertexCount/@connectedgraphs
Out[]=
{45,23,11,10,9,7,7,6,6,4,4,3,3,2,2,1,1,1,1,1,1,1,1,1,1,1}
In[]:=
subgraph=First@connectedgraphs
Out[]=
Sometimes it’s hard to pick out the exact edges from looking at the graph. That’s where