Fishalytics

A (failed) experiment in data analysis as a behaviour modification tool

Jefferey Cave
Jul 8, 2022

About 12 years ago, I gave up my 100 hour work weeks as a software developer, moved to the other side of Canada, and bought a small-holding farm. One of the first things I did in the new region was buy a fishing license, and I was shocked when they handed me an official “catch form” along with it.

In Nova Scotia, when you have a recreational fishing license, you are expected to keep track of the number of fish you catch, the region you caught them in, as well as what species they are.

For a year.

I was struck by how inefficient this was. It was highly unlikely that I was going to remember every fish I caught for a year, and just as unlikely that I would remember where I had put the fishing scorecard that I was required to send in at the end of the year.

A screen shot showing fishing licenses over the years. While I primarily fish in Canada, an acquaintance did give me written permission to fish his private fishery in England. A fond memory to keep track of.

This seemed like a simple data entry form project that I thought I might even be able to sell to the provincial government. At its simplest, the idea was that I could write an application that allowed fishermen to enter their fishing license information and record their fish as they caught them. At the end of the year, the software could just print the scorecard for you.

Over the next 10 years, this tool became the focus of much of my internal pondering regarding system development and the ability of software to affect social change.

I even went so far as to get my wife to design a logo for the project: a fish, over a grid, jumping from a stream (which is actually the curve of a graph from the app).

From Simple Record Keeping to Analysis

While the initial concept of the software was simply a record keeping tool for myself, it was not long before I started to think bigger.

  1. If I built the tool, I may as well make it more widely available.
  2. If I wanted others to use the tool, I needed to offer them more than just a fishing journal.

I had just hit on my first element of social design: you must offer something to the individuals you want to harvest data from. Like one of the major social platforms, I could harvest user data and sell it on, but in order to do that, I first needed to offer something in return.

There were two “value add” features I could think of right away.

  1. The social aspect. Keeping pictures and memories of catching fish is fun. Sharing fishing tales, in near real-time, would be a fun way for fishermen to engage with friends, and a great way to record them for future memory.
  2. Basic analysis. Knowing where you caught the last fish gives you some idea where to catch future fish, and using aggregate data of many fishermen, I could actually offer solid, unbiased, advice on where to catch the next fish.

That first idea, that people could share, was an important one: Facebook was young enough that competing services were still viable in niche fields. Unfortunately, social platform development was well outside my domain of expertise. The second option spoke to strengths I had, and it occurred to me that if I could develop a successful predictive product, partnering with a social platform could come later. Another benefit of focusing on the analysis was that I was the sole user at this point: it is almost impossible to build a social platform around a single individual, but it is possible for a single individual to gather multiple data points for an analysis.

The first focus was set: predict what leads to catching more fish.

Planning the Data Gathering

In order to suggest to users what they should do to catch more fish, I first had to think of the variables that would impact the ability to catch fish. The best place to determine this was at my local lake, staring out over the water.

  1. What about my fishing (behaviour) could affect my catching fish (outcome)?
  2. Also, what could affect my ability to catch fish, but was difficult to measure?
  3. Lastly, what was easy to measure, but probably had little to do with my catching fish (red herrings)?

In brainstorming these things, I came up with several things that had a high impact on my catch rate and were reasonably easy to measure.

  • Location
  • Time of Day (solar declination)
  • Temperature (air and water)
  • Solar penetration (cloud cover)
  • Covering vegetation
  • Lure
  • Time spent fishing
  • Time of Year

Unfortunately, of these, only two seemed easy enough that I could expect the average fisherman (i.e. me) to collect them regularly: space and time. Both are easily captured on a phone, so anything derived from them is easy to capture: location, time of day, time of year, and solar declination.

These two items can be measured easily through smart-phone GPS logging. By marking the start of a fishing trip, and having the application continuously log the position of the person fishing, we can get a sense of how long they stood in a given location with a line in the water not catching fish. Upon catching a fish, it is natural to want to capture the moment; snapping a picture (again through the app) marks the moment and location at which the fish was actually caught.

Measuring Success

Putting these variables together into a meaningful metric became the next problem.

In attempting to determine which variables lead to success, we need a clear definition of success. To some extent this requires distinguishing causal variables from outcome variables. In order to determine how a fisherman would consider themselves successful, I could think of no better way than to interview a recreational fisherman.

You find gold where the gold is

— Prospector’s Proverb

I went fishing.

Standing waist deep in water at my local lake, with a line in the water, I began pondering the variables that would make me successful at that moment. As I stood there, I realised it was “catching a lot of fish”. Catching fish is exciting, even small fish, so there is an element of quantity: the sheer number of fish you catch makes for a positive experience. But standing in a lake for 4 hours to catch two fish is not the same as hitting 2 fish in 20 minutes: the velocity at which we catch fish matters. It was at this moment that I caught the most impressive Small-mouth Bass I have ever caught. A big fish, on a small fishing rod, is a fun experience. Fishermen brag about that giant fish they caught. So it’s not only a measure of the quantity of fish caught, but also the quality of those fish.

A fishing trip can be measured as being successful by the “velocity” at which you catch fish.

velocity = (quantity of things) / (time spent)

There is a little more to it, as we want to factor in the quality of the fish caught. We can also consider the entire time spent as a single “fishing trip”.

velocity = (number of fish caught * quality of fish) / (time spent standing next to water)

The problem with this definition was that it was not very granular. The end goal was to create a heat map representing the “best” places to go fishing. This heat map would represent a range from null (no information) to good to bad. So while a “trip” represents a range of space-time (different fish caught at different locations and at different times), I required highly granular data that specified a point in space-time.

A hypothetical fishing trip with two fish caught. The fish took a certain amount of time to catch which represents an effort on my part. [original]

Here I take a cue from accounting. In addition to the point in time when the success was achieved, I can measure the time in between as the cost of catching the fish.

In the example above, the Small-Mouth only took 20 minutes to catch; however, I continued to fish without catching another for a full 50 minutes. We therefore allocate the unsuccessful time to the nearest fish caught.

Given this new perspective, we can change the measure of success to:

score = (quality of the fish) / (time spent fishing for that fish)

The trip score can then be considered the average of all the individual fish scores.

We are getting closer to a simple score with a clear definition of how to measure the time per fish. Unfortunately, the definition of “quality” of a fish is not a straightforward measure either.
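The time allocation and per-fish scoring described above can be sketched in a few lines of Python. This is only an illustration, not the FishAlytics code: the `Catch` class and the function names are hypothetical, and the allocation rule (each catch is charged the time since the previous catch, with any fishless time at the end of the trip charged to the last catch) is one assumed reading of “allocate the unsuccessful time to the nearest fish caught”.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Catch:
    minute: float   # minutes since the start of the trip when the fish was landed
    quality: float  # quality of the fish, on any consistent scale


def allocate_minutes(catches, trip_minutes):
    """Return the minutes of effort charged to each catch, in time order.

    Assumed rule: a catch is charged the time since the previous catch
    (or the start of the trip); fishless time after the final catch is
    added to that final catch.
    """
    ordered = sorted(catches, key=lambda c: c.minute)
    efforts = []
    previous = 0.0
    for catch in ordered:
        efforts.append(catch.minute - previous)
        previous = catch.minute
    if efforts:
        efforts[-1] += trip_minutes - previous
    return efforts


def fish_scores(catches, trip_minutes):
    """Score each fish as quality divided by the effort charged to it."""
    ordered = sorted(catches, key=lambda c: c.minute)
    efforts = allocate_minutes(ordered, trip_minutes)
    # Assumption: effort is floored at one minute to avoid dividing by zero.
    return [c.quality / max(e, 1.0) for c, e in zip(ordered, efforts)]


def trip_score(catches, trip_minutes):
    """Trip score: the average of the individual fish scores."""
    scores = fish_scores(catches, trip_minutes)
    return mean(scores) if scores else 0.0
```

For a single fish landed 20 minutes into a 70 minute trip, `allocate_minutes` charges the full 70 minutes to that fish, which matches the 20 + 50 minute example in the text.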

Generally size is considered the measure of a successful catch, but not all fish are considered equal. If I am fishing in a mountain stream and catch a “good sized trout” it is going to be a very different size from even a small Great White Shark. Age is also a factor: young fish will be smaller than older fish.

Another wrinkle enters when we consider why the government was going to be collecting this data: fisheries management. They want to know the health of the regional ecosystem. Under these conditions it is not sufficient to know that the fish is “bigger”, but that it is an appropriate weight for its age. High or low values could indicate various stresses on the population.

A species’ Standard Weight is a measure of the average weight of a fish given its length. This is basically BMI for fish: given a fish’s length, we can estimate its normal weight. This weight follows a power curve (fish get fatter faster than they get longer), weight = a × length^b, and the two constants a and b are unique to each species.

Two Standard Weights for two different species. As length increases so does weight, but at different rates. (Wikipedia CCSA-3.0)

To make use of Standard Weights, a database of the a and b parameters for each species is required. Fortunately, that database exists in the form of FishBase, an online catalogue of fish research from around the world.

To identify a fish’s parameters, all that is required is to know the fish species (common name is acceptable) and the location it was caught. This results in a page of information about the fish, including its Standard Weight parameters. For example, our Small-mouth Bass: (a) 0.0129, (b) 3.06, (len) about 8.5" or 216 mm, (weight) 1/3 lbs or 153g

The mean `a` and `b` values are offered in the footnotes.

Given this information, we can consider the “quality” of a fish to be its relative deviation from its Standard Weight. Note that, for convenience, scores are shifted to a positive range (catching a fish is always a good thing) between 0 and 1000 (because per mille has always amused me).

stdWeight = StdWt(a, b, mm)
= StdWt(0.0129, 3.0600, 216)
= 156.3g

quality = (weight - stdWeight) / stdWeight
= (153g - 156.3g) / 156.3g
= -0.021113243762 (for convenience, convert to per mille)
= floor((-0.021113243762 / 2 + 0.5) * 1000)
= 489‰

This is then converted to the score by dividing by the time spent fishing for the fish (20 minutes before the catch plus 50 minutes after, so 70 minutes):

score = quality / time
= 489 / 70
≈ 7

Using a standardised measure of quality gives us a comparable value for each individual fish caught, which allows us to produce aggregate values (such as the trip score) without concern for species variability.
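The worked example above can be reproduced with a short Python sketch, using the 216 mm, 153g small-mouth and its a and b values. The function names are hypothetical. Two assumptions are worth stating loudly: the a and b constants are taken to be FishBase-style length-weight parameters that expect the length in centimetres and return grams, and values outside the 0 to 1000 range are clamped, which the article does not specify.

```python
import math


def standard_weight_g(a, b, length_mm):
    """Expected weight in grams from the power curve W = a * L**b.

    Assumption: a and b expect the length L in centimetres, so the
    millimetre input is converted first.
    """
    length_cm = length_mm / 10.0
    return a * length_cm ** b


def relative_quality(weight_g, std_weight_g):
    """Relative deviation of the actual weight from the standard weight."""
    return (weight_g - std_weight_g) / std_weight_g


def quality_per_mille(quality):
    """Shift a relative deviation from [-1, 1] onto the 0 to 1000 scale.

    Assumption: fish more than twice their standard weight are clamped
    to 1000 rather than allowed to run off the scale.
    """
    shifted = quality / 2 + 0.5
    clamped = min(max(shifted, 0.0), 1.0)
    return math.floor(clamped * 1000)


def fish_score(per_mille, minutes):
    """Quality per minute of effort spent on this fish."""
    return per_mille / minutes


std = standard_weight_g(0.0129, 3.06, 216)       # roughly 156.3 g
quality = quality_per_mille(relative_quality(153, std))
score = fish_score(quality, 70)                  # a little under 7
```

Dividing by two before adding 0.5 is what places an exactly average fish at 500‰, so a fish slightly under its standard weight lands just below the midpoint.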

Analytics

By querying a space-time bounding box, a user can view places where the fishing has been particularly good, or particularly bad. This was charted using OpenStreetMap, Leaflet.js, and Heatmap.js. This is useful not only for fishermen, but also for ecologists looking to study the quality of the fish in the area.

The default example from HeatMap.js (demo)
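As a rough illustration of the space-time bounding box query, the Python sketch below filters scored catches to a box and a date range, then averages the scores per grid cell so they could be handed to a heat map layer. It is not the FishAlytics implementation: `ScoredCatch`, `heat_cells`, and the 0.01 degree cell size are all hypothetical, and the box is assumed not to cross the antimeridian.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ScoredCatch:
    lat: float
    lon: float
    when: datetime
    score: float


def in_box(c, south, west, north, east, start, end):
    """True when the catch falls inside the space-time bounding box."""
    return (south <= c.lat <= north
            and west <= c.lon <= east
            and start <= c.when <= end)


def heat_cells(catches, south, west, north, east, start, end, cell_deg=0.01):
    """Average score per grid cell, keyed by (row, column) from the south-west corner.

    Cells with no catches are simply absent, which corresponds to the
    "null (no information)" end of the heat map range.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for c in catches:
        if not in_box(c, south, west, north, east, start, end):
            continue
        key = (math.floor((c.lat - south) / cell_deg),
               math.floor((c.lon - west) / cell_deg))
        totals[key][0] += c.score
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}
```

Averaging per cell, rather than summing, keeps a heavily fished but mediocre spot from looking better than a rarely fished but excellent one.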

With sufficient observations, this standardisation of the data would allow for several other types of analysis:

  • Year over year analysis is possible, allowing ecologists to monitor for trends of declining or recovering populations.
  • Species comparative analysis can show one species filling in for another species, a common symptom of an environment in distress.
  • Time of day analysis, or seasonality, can improve catch rates for fishermen.
  • Poaching rates can be estimated from anonymous reports of poaching.
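The year over year analysis in the list above could be as simple as the following Python sketch, which groups per mille quality values by species and year and compares consecutive years. The function names and the record layout are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def yearly_quality(records):
    """Mean quality per (species, year).

    records: iterable of (year, species, quality_per_mille) tuples.
    """
    grouped = defaultdict(list)
    for year, species, quality in records:
        grouped[(species, year)].append(quality)
    return {key: mean(values) for key, values in grouped.items()}


def year_over_year_change(yearly, species, year):
    """Change in mean quality from the previous year.

    Returns None when either year has no observations, so a gap in the
    data is not mistaken for a decline.
    """
    current = yearly.get((species, year))
    previous = yearly.get((species, year - 1))
    if current is None or previous is None:
        return None
    return current - previous
```

A sustained negative change for one species alongside a positive change for another would be the kind of signal the species comparison bullet describes.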

This finally made me realise there were two target audiences for this data, other than just fishermen:

Fisheries Management

The ability to turn fishermen’s stories into meaningful analytics meant fisheries don’t need to wait until the end of the “season” to get paper records. Real-time catch data, collected by customers, can be used to gain insight into the health of bodies of water. This means early interventions can be taken.

The Ecosystem

Fish populations are early warning signs of ecological disasters. Changes in the types and sizes of fish in a population are a good indicator of the ecological health of the water. These analytics give real-time, early detection of ecological issues.

Fishermen don’t want to fish an over-fished area. Ecologists don’t want fishing happening in stressed areas. The hope was that a system like this could identify healthy populations of fish and direct fishermen toward those, leaving stressed populations alone to recover.

Conclusion

FishAlytics was a failed social experiment for me. After nearly a decade, I abandoned the project without it ever moving past a personal fishing journal. Competing priorities (I was a farm labourer), legal obligations (I didn’t want to lose intellectual property while working for some companies), and theft (a customer I pitched this to pitched it back to my students 4 years later) left me just not working on it and finally letting it go.

But I don’t think it was a waste of time either. It has offered me my first real introduction into the possibility of using Data Analysis as a tool for social change. Also, I started to conceive of how passive pressure could enact social change.

It also introduced me to the idea that everything can, and often should, be boiled down to a single metric of health. This has been beneficial to me in other Data Analysis roles where being able to identify variance in an abstract metric of “health” has allowed for early intervention.

While I failed to achieve any of the desired results, I hope this article helps others see software and systems as more than just forms on a page, and instead view them as tools for evolving social systems and modifying behaviour for the betterment of all.

Further Reading

Honestly, I would love for someone to pick up the torch, so the first reading would be to fork FishAlytics

Though, I will say, it was my first foray into server side JavaScript, so I’m not particularly proud of the code.

There are a few libraries that would be highly useful for this project

  • Leaflet.js: a mapping library for interacting in the browser
  • Plotly: a general charting library. Not used in the project, but a staple in things I do now
  • Heatmap.js: the library used for integrating heatmaps in FishAlytics

Also, if you are interested in enacting social change through software development, I must recommend some general reading

Finally, while I have suggested tools that I believe are useful; with great power comes great responsibility. I therefore leave you with this warning from Charles Goodhart: “Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes”, or more simply put

When a measure becomes a target, it ceases to be a good measure

Goodhart’s Law

If you enjoyed, please leave a comment or ask a question, and remember to click on follow to encourage future articles.

Also, consider the sequel:

UPDATE: 2023–08–08

Since writing this, I use FishBrain which is basically the same tool. They have been successful in the exact domain I failed… good on them.

Jefferey Cave

I’m interested in the beauty of data and complex systems. I use story telling to help others see that beauty. https://www.buymeacoffee.com/jeffereycave