Saturday, January 23, 2016

My mom uses math to win at Scrabble and Words With Friends.

As you know, this blog is about people who use untrained math, usually me. I didn't think I would have a post about someone else this early on, much less my mother.

My mother uses math to win at Words With Friends and Scrabble. She occasionally asks me to play Facebook games with her, and sometimes I do. She has a history of winning, so I thought I would take it up a notch and try to cream her by running my tiles through an anagram search and eyeballing which of the resulting words would score best. Granted, that's not a holistic approach, but I thought the edge from the anagram tool would guarantee me a win. In the two games we played simultaneously, she beat me by over 100 points in both. I asked her how she did it, but she just laughed at me. I later admitted to using the anagram tool, and she said that someone else she had beaten had admitted to using a tool that did some kind of screen reading and then decided your play for you. My mom beat whoever's algorithm that is.

So to you, unknown programmer, let me just say:



MY MOM, CAN BEAT YOUR MOM, AT WORDS WITH FRIENDS, WITH MATH



I know I should say "that programmer's algorithm," but saying "mom" felt, you know, better.

I've been staying with them this past week. And while I didn't continue to pester her about it, the seed was planted. Sometimes at night she'll come looking for me to ask how to spell things, and the seed kept getting watered. Today I saw her writing a list of words that looked like they were written in ROT13. She was playing Words With Friends, so I knew it was part of her strategy. I asked her about it, and this was her rundown:

1. Play the board for score: hit as many of the double-word, double-letter, triple-letter, etc. tiles as you can.

2. She believes there are essentially 12 letters that most words begin with, so she compiled lists of words starting with those letters, separated by word length, and uses them to pick her words for a given situation.

3. She also composed pages for J and Q, because they are high-point letters. (Presumably X as well.)

4. Never use a vowel as a starting letter.

There's no reason to talk about number one; it's just a good general strategy hint. Number two is interesting, because she used the dictionary in Words With Friends and Scrabble to create a tertiary dictionary she can brute-force by hand. It is constructed from the Scrabble/Words With Friends dictionary using something like combinatorics, but with more assumptions about patterns. And three is again just good strategy: if you're already building a dictionary, why not supplement it with a section that is high-scoring and not very dense? That accounts for diminishing returns.
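
For anyone who wants to build pages like hers from a word list, here is a minimal sketch in shell. It is a guess at the idea, not her actual method: the `words_for` function name and the sample words are made up for illustration, and it assumes a plain-text list with one word per line.

```shell
# words_for LETTER LENGTH [FILE]
# Print the words that start with LETTER and are exactly LENGTH letters long.
# Reads FILE, or standard input when FILE is omitted. Case is ignored.
# (Hypothetical helper; assumes one word per line.)
words_for() {
    letter=$1
    len=$2
    shift 2
    awk -v l="$letter" -v n="$len" '
        { w = tolower($1) }
        length(w) == n && substr(w, 1, 1) == tolower(l) { print w }
    ' "$@"
}

# Example with a tiny made-up list (not the real game dictionary):
printf '%s\n' quiz jinx stone store quart spy | words_for s 5
```

Running it once per letter and length gives one "page" per combination, which is roughly the shape of the lists described in point 2.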

I'm going to provide some figures so we can check whether my mom's eyeballed number of letters is close, as in, do you suffer diminishing returns after 12 letters?

Here is a sorted list of word counts by first letter, from the Words With Friends dictionary:

 173122 total
  19004 swords
  16236 cwords
  14612 pwords
  10479 awords
  10173 rwords
  10163 dwords
   9637 mwords
   9296 bwords
   8758 twords
   7078 ewords
   6989 iwords
   6746 fwords
   6312 hwords
   5854 owords
   5595 gwords
   5079 uwords
   5073 lwords
   4381 nwords
   3708 wwords
   2749 vwords
   1734 kwords
   1386 jwords
    839 qwords
    556 ywords
    549 zwords
    136 xwords
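
Counts like these can be produced from any dictionary file with one word per line. A minimal sketch, assuming such a file exists (the function name `count_by_first_letter` is hypothetical, and this is not necessarily the one-liner used for the list above):

```shell
# count_by_first_letter [FILE]
# Print "COUNT LETTER" pairs, largest count first (ties in letter order).
# Assumes one word per line; lines that do not start with a letter are skipped.
count_by_first_letter() {
    cat "$@" |
        tr 'A-Z' 'a-z' |
        grep '^[a-z]' |
        cut -c1 |
        sort |
        uniq -c |
        awk '{ print $1, $2 }' |
        sort -k1,1nr -k2,2
}

# Example with a tiny made-up list:
printf '%s\n' stone store spy cat cot dog | count_by_first_letter
```

The output has the same shape as the list above, minus the "words" suffix and the total line.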

Here it is without vowels:

 137643 total
  19004 swords
  16236 cwords
  14612 pwords
  10173 rwords
  10163 dwords
   9637 mwords
   9296 bwords
   8758 twords
   6746 fwords
   6312 hwords
   5595 gwords
   5073 lwords
   4381 nwords
   3708 wwords
   2749 vwords
   1734 kwords
   1386 jwords
    839 qwords
    556 ywords
    549 zwords
    136 xwords

I'd say the diminishing returns probably start at fwords, so s words through t words makes 8 letters rather than 12. I'm not going to name the letters she used, because that's still her secret sauce; they don't include all of the most popular ones, it's a strange mix, and I don't know how it was picked. Oh, and as a bonus she has a list of compound words, words that are generally longer than a single play, so you have a premade list of words you might add on to, starting from something someone else played.
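
One way to put a number on "diminishing returns" is the running share of words covered as each letter is added. A minimal sketch, assuming input in the same "COUNT NAME" layout as the lists above, already sorted largest first (the `cumulative_share` name is hypothetical):

```shell
# cumulative_share [FILE]
# Read "COUNT NAME" lines, skip any "total" line, and print
# "NAME COUNT CUMULATIVE_PERCENT" for each line in input order.
cumulative_share() {
    awk '
        $2 == "total" { next }
        NF >= 2 { n++; name[n] = $2; count[n] = $1; sum += $1 }
        END {
            for (i = 1; i <= n; i++) {
                running += count[i]
                printf "%s %d %.1f\n", name[i], count[i], 100 * running / sum
            }
        }
    ' "$@"
}

# Example with made-up counts:
printf '%s\n' ' 10 total' '  6 swords' '  3 cwords' '  1 xwords' | cumulative_share
```

Fed the consonant list above, the first eight letters (s through t) come to roughly 71% of the consonant-initial words and the first twelve to roughly 88%, so each letter after the eighth adds noticeably less.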

Keep in mind my sorting didn't include word density either.

In closing, my mom's strategy is probably not perfect. I do think there is something to be said for human pattern recognition; that's probably helping her skip some math steps I'm not listing, or she just has some miscalculations. That doesn't change the fact that my mom can beat your mom at Scrabble using math.

Also, I'll be putting the one-liners used to put these lists together today at the bottom of my end-of-the-day blog post at Bsdpunk blog.

Tuesday, January 5, 2016

NFL and NHL DFS ( DraftKings ) Combinatorics part 2

Following up on Part 1: I fixed the NFL script so that the FLEX player is accounted for.
Here's the script:

dk <- read.csv("nflredux")
dk <- lapply(split(dk, dk$Position), function(x) x[sample(15), ])

dk <- dk[c("G","W","C","D","U")]
15*choose(15,3)*choose(15,2)*choose(4,2)*4

rows <- list(t(1:15), combn(15,3), combn(15,2), combn(4,2), t(1:4))

dims <- sapply(rows, NCOL)
inds <- expand.grid(mapply(`:`, 1, dims))

dim(inds)

extract <- function(ind) {
    g <- inds[ind,]
    do.call(rbind, lapply(1:5, function(i) dk[[i]][rows[[i]][,g[[i]]], ]))
}

extract(1)

win <- c(0, 0, 0)
for(i in 1:17000000)
{
    extracted <- extract(i)
    #print(i)
    #print(sum(extracted$Price))
    if(sum(extracted$Price) < 50000){
        if(win[3] < sum(extracted$Points)){
                #print(sum(extracted$Points))
                win <- c(i, sum(extracted$Price), sum(extracted$Points))
                print(win)
                print(extracted)
        }
    }
}

print(win)


Also, when I let the hockey script run for less than a few hours, the team it gave me did very poorly...like dead-last poorly. I would share the results, but one of the players had an injury, and I subbed him and the UTIL player out based on a best guess. Is this telling about this particular use of math to solve this particular problem? I wasn't using the full data set or the best results, and then I subbed players out, so I don't know, probably not. I'm going to add some filtering and make it run in parallel so the results finish faster. I'll also take a look at using a normal distribution to determine the best outcomes and weight toward those, with regard to strong defense and aggression. And again, I don't want to steal credit: for the most part, this is currently a script from here.
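
What the filtering will look like isn't settled, but one cheap option is to drop players who give few points per dollar before building any combinations. A minimal sketch under loud assumptions: the `filter_value` name, the clean `Position,Name,Price,Points` column layout, and the idea of thresholding on points per $1000 are all hypothetical choices for illustration.

```shell
# filter_value MIN [FILE]
# Keep the header line plus every row whose points per $1000 of price
# is at least MIN. Assumes comma-separated columns:
#   Position,Name,Price,Points   (assumed layout, not the raw export)
filter_value() {
    min=$1
    shift
    awk -F, -v min="$min" '
        NR == 1 { print; next }
        $3 > 0 && ($4 / ($3 / 1000)) >= min { print }
    ' "$@"
}

# Example with made-up players:
printf '%s\n' 'Position,Name,Price,Points' 'G,A,8000,4' 'W,B,5000,10' | filter_value 1
```

Fewer players per position means fewer combinations to walk through, at the risk of throwing away someone who only looks bad on average.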
I'll have another update on DraftKings / DFS math on Friday, but I'll have other posts before then as well.

Monday, January 4, 2016

Applying combinatorics to DFS (DraftKings) NFL and NHL lineups, using GNU tools and the R programming language (Part 1)

Fair warning, I am a lazy developer and a bit of a crazy person. If you want to know the extent, I recommend having a look at the “about me” section. If you are comfortable with that, please proceed. But you've been warned!

This article is about using bad math and poorly conceived ideas to make a guess for Daily Fantasy Sports betting, e.g. DraftKings.

Recently someone gifted me a few dollars' worth of plays on DraftKings. Being unemployed with time on my hands, and knowing nothing about sports, I decided to apply my brand of math to this. I've seen nearly every episode of Numb3rs...sort of, so I was like, combinatorics...that's the way to go.

Combinatorics is a branch of mathematics concerning the study of "finite and discrete structures."

Based on my 8th grade understanding of mathematics, that means combinatorics is essentially the mixing of lists. In this case it's mixing lists to find the most points for the salary cap on DraftKings. You know, using math to do something noble.

I'm not very good at many things, but cutting up a CSV (comma separated values, i.e. a text file with a consistent format) is an exception, and DraftKings lets you export a CSV file with the basic stats, e.g. player position, player name, average points per game, salary for purchase in the lineup, etc.


First I cut up the CSV with good ol' awk and sed:

awk -F "\"*,\"*" '{print $1,"," $2,"," $3,"," $5}' DKSalaries.csv | sed "s/\"QB /QB/g" | sed "s/\"RB /RB/g" | sed "s/\"WR /WR/g" | sed "s/\"TE /TE/g" | sed "s/\"DST /DST/g" > nflredux


And then, to add the FLEX players:

cat nflredux | grep -P "^WR|^QB|^RB|^TE" | sed "s/^[Q|R|T|W][B|E|R]/FLEX/" >> nflredux
I did a similar thing to the NHL CSVs. If for some odd reason you need these one-liners, hit me up on Twitter @bsdpunk.
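
The FLEX one-liner reads and appends to the same file in one pipeline, which happens to work but is easy to get wrong. A minimal sketch of a more cautious version, assuming the position sits in the first comma-separated column with no stray spaces (the `add_flex` name is hypothetical):

```shell
# add_flex FILE
# Append a FLEX copy of every QB/RB/WR/TE row in FILE.
# The copies are built in a temporary file first, so FILE is never
# read and appended to at the same time.
# Assumes the position is the first comma-separated field.
add_flex() {
    tmp=$(mktemp) || return 1
    grep -E '^(QB|RB|WR|TE),' "$1" | sed -E 's/^(QB|RB|WR|TE),/FLEX,/' > "$tmp"
    cat "$tmp" >> "$1"
    rm -f "$tmp"
}

# Usage (file name taken from the one-liners above):
#   add_flex nflredux
```

This mirrors the original pattern, which copies QB rows too. As far as I know the DraftKings FLEX slot only takes RB, WR and TE, so drop QB from both patterns if that holds for your contest.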

So I altered an R language script. The one on Stack Overflow assumes a sample size of 4; mine was expanded so it could actually find real matches with a larger sample size. To get true results you need something more than a 10-year-old laptop; in fact you would probably be best off rewriting it for parallel processing and running it on a Parallella.

I believed the current version of R was taking advantage of multiple cores, since the “lapply” call runs very quickly. As far as I can tell, though, base R's “lapply” runs on a single core, and it is the parallel package (“mclapply”, “parLapply”) that spreads work across cores. The slowness is due to my shitty for loop, and I am not even sure how you would parallelize that, maybe rabbitmq... I'll do that in the future if my blog generates revenue (which means I will probably never do that).

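dk <- read.csv("nflredux")

# Keep the first 15 rows of each position, in shuffled order
# (a position with fewer than 15 rows ends up with NA rows)
dk <- lapply(split(dk, dk$Position), function(x) x[sample(15), ])
dk <- dk[c("QB","WR","RB","TE","DST","FLEX")]

# Number of lineups in the grid built below
15*choose(10,3)*choose(10,2)*15*4*4

# Per position, one column per pick: 1 QB, 3 WR, 2 RB, 1 TE, 1 DST, 1 FLEX
rows <- list(t(1:15), combn(10,3), combn(10,2),t(1:15),t(1:4),t(1:4))
dims <- sapply(rows, NCOL)
inds <- expand.grid(mapply(`:`, 1, dims))
dim(inds)

# Build lineup number ind as a data frame of players.
# Note: 1:5 binds only the first five groups, so the FLEX row is left out.
extract <- function(ind) {
    g <- inds[ind,]
    do.call(rbind, lapply(1:5, function(i) dk[[i]][rows[[i]][,g[[i]]], ]))
}

# Track the best-scoring lineup under the salary cap: (index, price, points)
win <- c(0, 0, 0)
for(i in 1:17000000)
{
    extracted <- extract(i)
    if(sum(extracted$Price) < 50000){
        if(win[3] < sum(extracted$Points)){
            win <- c(i, sum(extracted$Price), sum(extracted$Points))
            print(win)
            print(extracted)
        }
    }
}

print(win)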


Here is my NHL script:
dk <- read.csv("thisone")
dk <- lapply(split(dk, dk$Position), function(x) x[sample(15), ])

dk <- dk[c("G","W","C","D","U")]
15*choose(15,3)*choose(15,2)*choose(4,2)*4

rows <- list(t(1:15), combn(15,3), combn(15,2), combn(4,2), t(1:4))

dims <- sapply(rows, NCOL)
inds <- expand.grid(mapply(`:`, 1, dims))

dim(inds)

extract <- function(ind) {
    g <- inds[ind,]
    do.call(rbind, lapply(1:5, function(i) dk[[i]][rows[[i]][,g[[i]]], ]))
}

extract(1)

win <- c(0, 0, 0)
for(i in 1:17000000)
{
    extracted <- extract(i)
    #print(i)
    #print(sum(extracted$Price))
    if(sum(extracted$Price) < 50000){
        if(win[3] < sum(extracted$Points)){
                #print(sum(extracted$Points))
                win <- c(i, sum(extracted$Price), sum(extracted$Points))
                print(win)
                print(extracted)
        }
    }
}

print(win)


As I said, these probably won't give you the best results, because on my shitty laptop the scripts would take far too long to run; in fact, it would take longer than the time between when the lineups are posted and when the game goes live. So even with parallel processing on my current hardware, I still couldn't run the complete problem set in that time. Which I probably could do...with, you know, revenue.
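
To get a feel for the size of the problem, the lineup counts written in the scripts can be worked out in shell. A minimal sketch (the `choose` helper is hypothetical; the two products are copied from the scripts above):

```shell
# choose N K: binomial coefficient "N choose K".
# Each step of the loop stays a whole number, so plain awk arithmetic is
# fine for the small values used here.
choose() {
    awk -v n="$1" -v k="$2" 'BEGIN {
        c = 1
        for (i = 1; i <= k; i++) c = c * (n - k + i) / i
        printf "%d\n", c
    }'
}

# Products taken from the NHL and NFL scripts.
nhl=$(( 15 * $(choose 15 3) * $(choose 15 2) * $(choose 4 2) * 4 ))
nfl=$(( 15 * $(choose 10 3) * $(choose 10 2) * 15 * 4 * 4 ))

echo "NHL lineups: $nhl"   # 17199000
echo "NFL lineups: $nfl"   # 19440000
```

Both totals are larger than the 17000000 hard-coded in the for loops, so even a full run stops short of the last rows of the grid. And that is only the sampled subset of players, not the whole player pool.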

I could do a lot of pre-calculation on the players before the game, sort of like an incremental development thing. And I actually have access to a lot of older and newer NHL stats, despite most sports APIs charging 360 dollars a month! THE FUCK? INFORMATION IS FREE...WE ALREADY WON THIS WAR! Luckily I know an autistic man who loves hockey and can tell me what the weather was like when the Predators stomped the Rangers.

So there's a lot more of this particular type of number crunching to do. And I think there's a lot of game theory you could use to make these results much better, like weighting highly offensive teams higher, and increasing the weight if they are playing a team not considered to have a good defense. I don't know how you would mathematically determine who is highly offensive or who has a lesser defense; if someone knows a way to calculate that, or has an idea, shoot it at me. You could also use something like a normal distribution to show a bell curve of a given player's performance, and make decisions based on that.
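
The bell-curve idea needs two numbers per player: the mean and the standard deviation of their past scores. A minimal sketch, assuming one fantasy-point total per line for a single player (the `mean_sd` name and the sample scores are made up):

```shell
# mean_sd [FILE]
# Read one score per line and print "MEAN SD", using the sample
# standard deviation (divides by n - 1).
mean_sd() {
    awk '
        NF { n++; sum += $1; sumsq += $1 * $1 }
        END {
            if (n < 2) { print "need at least two games" > "/dev/stderr"; exit 1 }
            mean = sum / n
            var = (sumsq - n * mean * mean) / (n - 1)
            if (var < 0) var = 0   # guard against rounding error
            printf "%.2f %.2f\n", mean, sqrt(var)
        }
    ' "$@"
}

# Example with five made-up game scores:
printf '%s\n' 10 12 14 16 18 | mean_sd   # prints "14.00 3.16"
```

If the scores really were normally distributed, about 68% of games would land within one standard deviation of the mean, so roughly 10.84 to 17.16 points here. Whether real player scores follow a bell curve is an open question, so treat this as a rough guide only.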

Of course, combinatorics doesn't take into account the possibility of rigged or cheating matches, or players not playing to their full potential. Though it could be argued that if they play this way consistently, it is reflected in the stats.

Ok, so all of this is PART 1, i.e. written before anything was actually played. I'll update you with some stats in PART 2.

***The NFL script is actually broken; I will address this in a future post.

Sunday, January 3, 2016

What is Outsider Math?

Outsider art is defined as:
art created outside the boundaries of official culture
Outsider art and music are typically committed by those on the fringe of culture, by the disenfranchised and mentally ill. If you want to know more about outsider music I suggest reading Songs in the Key of Z, and for outsider art I suggest...well, I haven't really read a good book on it, but the Wikipedia article is fascinating. This blog will be a representation of math that is subjective, which is to say, wrong. Math is objective; that's what makes it math. As a certified crazy person (to quote the Wordburglar, I got receipts and shit), I will be presenting math to you as though that isn't true. I am not an expert on math, and I don't know anything about statistics or data analysis. I am not a very good software developer either; I'm sure any scripts I write for this site will be riddled with problems.

So what does qualify me to write this blog?

I've seen every episode of Numb3rs...sort of.* I watched the first three episodes of The Code. I've been through, like, at least half of the Algebra section on Khan Academy. And I love math.

And I just want you guys to know: I didn't always love math. In fact, many teachers actively convinced me that I was not only bad at it, but that I shouldn't pursue it. If just one teacher had said, "You can suck at arithmetic, but be good at math," my entire life would be different. So I became a Linux admin, which I also love. Using the GNU toolset and Linux is my core competency. But this amateur brings math to you, hopefully in an entertaining way that you recognize as perhaps false or satire, when it is.

OUTSIDER MATH, BRINGS IT.

*If I'm doing something as mindless as TV I usually accompany it with adult beverages.

**There are things I'm badass at. They just mostly involve Linux and drinking.