Resetting AWS Account

Author jyablonski

Updated May 13, 2026

Tags runbook, aws, terraform, infrastructure

Resetting the AWS account recreates project infrastructure in a new account, migrates required source data, and reconnects deployment pipelines and DNS.

Runbook Steps

Run Part 1 of the SQL Script (code below) to save all source data from the database, including:
1. bronze Schema Tables
2. ML Tables in silver and gold Schemas
3. public Schema Tables
- NOTE: The dbt tables can always be rebuilt from the extracted source data, so they aren't pulled here
Run terraform destroy to delete infrastructure in the current AWS Account
Run aws-nuke -c aws_nuke.yml --no-dry-run to delete all non-Terraform resources in the current AWS Account
1. Ensure all S3 Buckets are deleted before proceeding to the next step
  1. Either run aws s3 rm s3://example-bucket --recursive to remove all bucket contents, or manually press Empty on each bucket.
Create a new email address (e.g. example@gmail.com)
Create a new AWS Account with the new email address
Add MFA to the root user of the new AWS Account
Go to Account and enable IAM user and role access to Billing information
Create a sign-in IAM User and grant it the AdministratorAccess and Billing IAM Policies
Add MFA to the sign-in user
Create Access / Secret Keys for the sign-in user
Create terraform-user and grant it the AdministratorAccess Policy
Create Access / Secret Keys for terraform-user
Update the terraform-user Keys in the Terraform Repo & Terraform Cloud
Run terraform apply to build all infrastructure in New Account (1st run)
- NOTE: This will likely fail on the ECS Services because the Docker Images are not in ECR yet. They are pushed in a later step
Go to Squarespace, open DNS -> Domain Nameservers, and update them based on the NS values in the Route 53 Hosted Zone
Manually add jyablonski9@gmail.com as a verified email identity in SES, then accept the verification email in the inbox
- AWS SES Link
Run Liquibase Migrations to build source tables
Run Part 2 of the SQL Script (code below) to store all source data into the new RDS Database
Update IAM ROLE GitHub Actions Secret on the following Repositories:
1. nba_elt_ingestion
2. aws_terraform
3. nba_elt_dbt
4. nba_elt_rest_api
5. nba_elt_mlflow
6. jyablonski_liquibase
Run all CI / CD Pipelines to build and push the Docker images for each service to ECR
Run terraform apply to build any remaining infrastructure in New Account (2nd run)
Profit
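
One detail about the S3 step: `aws s3 rm --recursive` removes only current objects, so a versioned bucket can still hold old versions and delete markers that block its deletion. Below is a minimal sketch of emptying buckets with boto3, assuming boto3 is installed and the active credentials point at the old account. `empty_bucket` and `empty_all_buckets` are hypothetical helper names, not part of the original runbook.

```python
def empty_bucket(bucket) -> None:
    """Delete every object, object version, and delete marker in a bucket.

    `bucket` is expected to be a boto3 `s3.Bucket` resource.
    """
    # Current objects (what `aws s3 rm --recursive` would remove).
    bucket.objects.all().delete()
    # Noncurrent versions and delete markers left behind on versioned buckets.
    bucket.object_versions.all().delete()


def empty_all_buckets(s3_resource) -> list[str]:
    """Empty every bucket visible to the given boto3 S3 resource."""
    emptied = []
    for bucket in s3_resource.buckets.all():
        empty_bucket(bucket)
        emptied.append(bucket.name)
    return emptied


# Usage (assumption: boto3 is installed and credentials target the OLD account):
#   import boto3
#   for name in empty_all_buckets(boto3.resource("s3")):
#       print(f"Emptied s3://{name}")
```

This empties every bucket the credentials can see, so it only makes sense on an account that is about to be torn down.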

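Updating the same GitHub Actions secret across six repositories by hand is easy to get wrong, so the step can be scripted. Below is a rough sketch that shells out to the GitHub CLI (`gh secret set`) and passes the value on stdin. It assumes `gh` is installed and authenticated. The owner `jyablonski`, the secret name `IAM_ROLE`, and the ARN in the usage comment are placeholders for illustration, so substitute the real values.

```python
import subprocess

# Repositories listed in the runbook step.
REPOS = [
    "nba_elt_ingestion",
    "aws_terraform",
    "nba_elt_dbt",
    "nba_elt_rest_api",
    "nba_elt_mlflow",
    "jyablonski_liquibase",
]


def build_secret_command(owner: str, repo: str, secret_name: str) -> list[str]:
    """Build the `gh secret set` command for one repository."""
    return ["gh", "secret", "set", secret_name, "--repo", f"{owner}/{repo}"]


def update_secret(owner, secret_name, secret_value, repos=REPOS, run=subprocess.run):
    """Set the same Actions secret on every repository in `repos`."""
    for repo in repos:
        cmd = build_secret_command(owner, repo, secret_name)
        # The value goes through stdin so it never appears in the process list.
        run(cmd, input=secret_value, text=True, check=True)


# Usage (all three values are placeholders):
#   update_secret("jyablonski", "IAM_ROLE", "arn:aws:iam::<account-id>:role/<role-name>")
```
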
SQL Script Code

SQL Save & Reload Script

from datetime import datetime
import os

from jyablonski_common_modules.sql import create_sql_engine
import pandas as pd

engine = create_sql_engine(
    database=os.environ.get("RDS_DB"),
    schema="bronze",
    user=os.environ.get("RDS_USER"),
    password=os.environ.get("RDS_PW"),
    host=os.environ.get("IP"),
    port=17841,
)

ml_models_tables = ["ml_game_predictions"]

nba_prod_tables = [
    "feature_flags",
    "feature_flags_audit",
    "incidents",
    "incidents_audit",
    "rest_api_users",
    "rest_api_users_audit",
    "user_predictions",
    "user_predictions_audit",
]

bronze_tables = [
    "player_attributes",
    "play_in_details",
    "boxscores",
    "internal_player_attributes",
    "reddit_posts",
    "reddit_comments",
    "bbref_player_contracts",
    "bbref_league_transactions",
    "bbref_player_stats_snapshot",
    "bbref_team_opponent_shooting_stats",
    "bbref_team_preseason_odds",
    "bbref_player_pbp",
    "internal_league_inactive_dates",
    "twitter_tweets",
    "twitter_tweepy_legacy",
    "bbref_player_boxscores",
    "bbref_team_adv_stats_snapshot",
    "aws_twitter_tweets_source",
    "internal_team_top_players",
    "internal_team_attributes",
    "draftkings_game_odds",
    "bbref_player_shooting_stats",
    "bbref_player_injuries",
    "bbref_league_schedule",
    "staging_seed_player_attributes",
    "staging_seed_team_attributes",
    "staging_seed_top_players",
]

public_tables = [
    "rest_api_users",
    "user_predictions",
]


def store_sql_table(connection, table: str, schema: str) -> None:
    """Export one table to tables/<year>/<schema>/<table>-<date>.parquet."""
    todays_date = datetime.now().date()
    year = todays_date.year
    output_dir = f"tables/{year}/{schema}"
    output_path = f"{output_dir}/{table}-{todays_date}.parquet"

    try:
        df = pd.read_sql(f"SELECT * FROM {schema}.{table};", con=connection)
        print(f"Queried {len(df)} records from {schema}.{table}")

        # Create directory if it doesn't exist
        os.makedirs(output_dir, exist_ok=True)

        df.to_parquet(output_path)
        print(f"Wrote {schema}.{table} to {output_path}")

    except Exception as e:
        print(f"Error occurred while exporting {schema}.{table}: {e}")
        raise


with engine.connect() as connection:
    # ML tables are stored in the gold schema
    for table in ml_models_tables:
        store_sql_table(connection=connection, table=table, schema="gold")

    for table in public_tables:
        store_sql_table(connection=connection, table=table, schema="public")

    for table in bronze_tables:
        store_sql_table(connection=connection, table=table, schema="bronze")