Pasted as Python by Manohar ( 8 years ago )
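import random

# group rows into a list of dataframes, one per trip_id
# (`df` is assumed to be an existing DataFrame with trip_id and class columns)
groups = [group for _, group in df.groupby('trip_id')]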

# find the trips in each class
trip_classes = {}
for group in groups:
    trip_id = group['trip_id'].iloc[0]
    class_name = group['class'].iloc[0]
    trip_classes.setdefault(class_name, []).append(trip_id)

# for each class, shuffle the trip_id's in trip_classes
random.seed(2)
for key in trip_classes:
    random.shuffle(trip_classes[key])

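# train/test split: per class, the first 7 shuffled trips go to train and the
# next 2 to test (fixed counts, so classes with fewer than 9 trips contribute fewer)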
train_trip_ids = []
test_trip_ids = []
for key in trip_classes:
    train_trip_ids.extend(trip_classes[key][0:7])
    test_trip_ids.extend(trip_classes[key][7:9])

# get the train and test data
train = df[df.trip_id.isin(train_trip_ids)]
test = df[df.trip_id.isin(test_trip_ids)]
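
The steps above can also be wrapped in a single function and checked on toy data. This is only a sketch, not the original author's code: the name `split_trips_by_class`, its parameters, and the toy frame are hypothetical, and it assumes every trip carries exactly one class. It uses a local `random.Random(seed)` and iterates classes in sorted order, so it is not guaranteed to reproduce the exact split of the paste.

```python
import random

import pandas as pd


def split_trips_by_class(df, n_train=7, n_test=2, seed=2):
    """Hypothetical helper: per-class, trip-level train/test split."""
    # one (trip_id, class) row per trip; assumes each trip has a single class
    trips = df[['trip_id', 'class']].drop_duplicates('trip_id')
    rng = random.Random(seed)  # local RNG, leaves the global random state alone
    train_ids, test_ids = [], []
    for _, class_trips in trips.groupby('class'):
        ids = class_trips['trip_id'].tolist()
        rng.shuffle(ids)
        train_ids.extend(ids[:n_train])
        test_ids.extend(ids[n_train:n_train + n_test])
    return df[df['trip_id'].isin(train_ids)], df[df['trip_id'].isin(test_ids)]


# toy data (made up): 2 classes, 9 trips each, 3 rows per trip
toy = pd.DataFrame({
    'trip_id': [t for t in range(18) for _ in range(3)],
    'class': ['a' if t < 9 else 'b' for t in range(18) for _ in range(3)],
})
toy_train, toy_test = split_trips_by_class(toy)

# no trip may appear on both sides of the split
assert set(toy_train['trip_id']).isdisjoint(toy_test['trip_id'])
```

Splitting on trip ids rather than on rows keeps all rows of a trip on the same side, which is the point of grouping by `trip_id` first.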

 
