Operation ML SpyCraft - Week 4

Welcome to ML SpyCraft Week, where intelligence builds thinking solutions.
This is a continuation of Week 2, but this time you will be building ML models. Choose one of the three (3) missions below and submit a short report of your findings along with your model. Sample code has been provided, but you are free to go all out with your own custom code.

Submission Guidelines:

1) Submit a 2-page case report in PDF format, with your findings, recommendations and data visuals to support your case

2) Ensure the file name includes your name and the mission name, e.g. bossteam_operationdoubleagent

Note: you can work as a team when analysing the data and building your models, but write your reports separately. We want to see your innovativeness shine through.

Submit your Analysis here

Mission 1: Operation Double Agent - 100 points

Objective: We need to quickly identify potential Double Agents. Build a Linear Regression model to find the strongest predictors of how much of a threat a double agent poses, as measured by 'threat_score', a quantitative value the IMPACT Agency has uncovered. Your findings will allow us to quickly trigger control measures against them.

Your Data:
weekly_coded_messages Integer. Number of coded or encrypted messages sent or received weekly by the agent.

unusual_financial_transactions Integer. Count of irregular or suspicious financial transactions identified for the agent.

hours_on_dark_web Float. Average number of hours the agent spent on dark web–related activities per week.

data_leaks_mb Float. Estimated volume of data leaks (in megabytes) attributed to or involving the agent.

threat_score Float. Composite risk index (0–100) reflecting the agent’s potential threat level based on all contributing factors.

Double Agent Data here
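
One way to approach this mission is sketched below, assuming Python with pandas and scikit-learn. It standardizes the four features so their coefficients are comparable, fits a linear regression on threat_score, and ranks the features by absolute coefficient. The helper name rank_predictors and the synthetic demo data are illustrative assumptions; the demo data is randomly generated and is not the mission dataset, so load the real file in its place.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

FEATURES = [
    "weekly_coded_messages",
    "unusual_financial_transactions",
    "hours_on_dark_web",
    "data_leaks_mb",
]


def rank_predictors(df, target="threat_score"):
    """Hypothetical helper: fit on standardized features, rank by |coefficient|."""
    # Standardizing puts every feature on the same scale, so a larger
    # absolute coefficient means a larger change in threat_score per
    # standard deviation of that feature.
    X = StandardScaler().fit_transform(df[FEATURES])
    y = df[target]
    model = LinearRegression().fit(X, y)
    coefs = pd.Series(model.coef_, index=FEATURES)
    order = coefs.abs().sort_values(ascending=False).index
    return coefs.loc[order], model.score(X, y)


# Synthetic stand-in data (an assumption for illustration only).
# The weights below are made up so the example has something to recover.
rng = np.random.default_rng(0)
n = 500
demo = pd.DataFrame({
    "weekly_coded_messages": rng.integers(0, 50, n),
    "unusual_financial_transactions": rng.integers(0, 10, n),
    "hours_on_dark_web": rng.uniform(0, 20, n),
    "data_leaks_mb": rng.uniform(0, 500, n),
})
demo["threat_score"] = (
    0.2 * demo["weekly_coded_messages"]
    + 3.0 * demo["unusual_financial_transactions"]
    + 1.0 * demo["hours_on_dark_web"]
    + 0.01 * demo["data_leaks_mb"]
    + rng.normal(0, 2, n)
)

ranked, r2 = rank_predictors(demo)
print(ranked)
print(f"R^2 on the fitting data: {r2:.3f}")
```

Raw coefficients depend on each feature's units (messages, hours, megabytes), which is why this sketch compares standardized coefficients instead. For the report, it is also worth checking R^2 on held-out data rather than only on the data used for fitting.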

 

Mission 2: Encrypted Threats - 100 points

Objective: We got a tip that Double Agents have been using our encrypted message network to leak sensitive information. Build a Linear Regression model to classify encrypted messages as threats to the IMPACT Spy Agency. These findings will help our Counterintelligence Unit track them down.

Your Data:
message_length Integer. Total number of characters or tokens in the message sent. Represents how long each communication is.

encryption_strength Integer. Encryption level used for the message, typically ranked from 1 (weak) to 10 (strong).

origin_country Categorical (String). The country from which the message originated.

time_of_day_sent_hour Integer. The hour of the day (0–23) when the message was sent. Useful for temporal analysis.

is_threat Binary (0 or 1). Target variable indicating whether the message was classified as a potential threat (1) or not (0).

Download Message Classification Dataset
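
A possible starting point for this mission is sketched below, assuming Python with pandas and scikit-learn. The brief names Linear Regression, but because is_threat is binary this sketch swaps in logistic regression, its classification counterpart; treat that as an assumption and adjust if the mission requires otherwise. The text column origin_country is one-hot encoded, since a regression model needs numeric inputs. The function name build_classifier, the country labels and the synthetic demo data are all illustrative and are not the mission dataset.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["message_length", "encryption_strength", "time_of_day_sent_hour"]
CATEGORICAL = ["origin_country"]


def build_classifier():
    """Hypothetical helper: preprocessing plus logistic regression in one pipeline."""
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), NUMERIC),
        # Unknown countries seen only at prediction time are encoded as all zeros.
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("model", LogisticRegression(max_iter=1000)),
    ])


# Synthetic stand-in data (an assumption for illustration only).
rng = np.random.default_rng(1)
n = 600
demo = pd.DataFrame({
    "message_length": rng.integers(20, 2000, n),
    "encryption_strength": rng.integers(1, 11, n),
    "origin_country": rng.choice(["A", "B", "C"], n),
    "time_of_day_sent_hour": rng.integers(0, 24, n),
})
# Made-up rule: stronger encryption and country "C" raise the threat odds.
logit = 0.8 * (demo["encryption_strength"] - 5.5) + 1.5 * (demo["origin_country"] == "C")
probability = 1 / (1 + np.exp(-logit))
demo["is_threat"] = (rng.random(n) < probability.to_numpy()).astype(int)

X = demo[NUMERIC + CATEGORICAL]
y = demo["is_threat"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = build_classifier().fit(X_train, y_train)
predictions = clf.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Held-out accuracy: {accuracy:.3f}")
```

Accuracy alone can mislead when threats are rare, so a confusion matrix or precision and recall (for example via sklearn.metrics.classification_report) may make a stronger case in the report.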

 

Mission 3: The Mole Profiler - 100 points

Objective: We have identified a secret network of Moles in our organisation. Build a Decision Tree Model to identify the top two most critical behaviours/attributes we can use to identify a mole. These insights will be used to tighten our security monitoring. Explain your reasoning by tracing the paths from the root of the tree.

Your Data:
years_of_service Integer. Total number of years the agent has been employed in the organisation.

secure_files_accessed Integer. Number of times the agent accessed secure or classified files.

off_hours_logins Integer. Count of times the agent logged in outside regular working hours.

performance_review_score Float. The agent’s latest performance review score ranging from 0 (poor) to 5 (excellent).

travel_to_hostile_nation Binary (0 or 1). Indicates if the agent has recently travelled to a nation considered hostile (1 = yes, 0 = no).

is_mole Binary (0 or 1). Target variable indicating if the agent was identified as a mole (1) or not (0).

Download Agent Mole Profiles
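
A minimal sketch for this mission follows, assuming Python with pandas and scikit-learn. It fits a shallow decision tree, reads the top two attributes from the tree's feature importances, and prints the tree as text so each path can be traced from the root. The max_depth value and the synthetic demo data, including its made-up mole rule, are assumptions for illustration and are not the mission dataset.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

FEATURES = [
    "years_of_service",
    "secure_files_accessed",
    "off_hours_logins",
    "performance_review_score",
    "travel_to_hostile_nation",
]

# Synthetic stand-in data (an assumption for illustration only).
rng = np.random.default_rng(2)
n = 800
demo = pd.DataFrame({
    "years_of_service": rng.integers(0, 30, n),
    "secure_files_accessed": rng.integers(0, 200, n),
    "off_hours_logins": rng.integers(0, 40, n),
    "performance_review_score": rng.uniform(0, 5, n).round(1),
    "travel_to_hostile_nation": rng.integers(0, 2, n),
})
# Made-up rule so the example has a known answer to recover.
demo["is_mole"] = (
    (demo["off_hours_logins"] > 25) & (demo["travel_to_hostile_nation"] == 1)
).astype(int)

# A shallow tree stays readable; max_depth=3 is an arbitrary choice here.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(demo[FEATURES], demo["is_mole"])

# Importance reflects how much each feature reduces impurity across its splits.
importances = pd.Series(tree.feature_importances_, index=FEATURES)
importances = importances.sort_values(ascending=False)
top_two = list(importances.index[:2])
print("Top two attributes:", top_two)

# Text rendering of the tree: follow the branches from the root to a leaf.
print(export_text(tree, feature_names=FEATURES))
```

In the printed tree, each indented line is a split condition, and reading from the first line down to a "class" line gives one complete decision path. On the real data, evaluating the tree on a held-out split helps confirm that the top attributes are not an artefact of overfitting.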

 
 