Using pandas, how can I reduce a data set by deleting every row whose bool
values are False in all of its value columns? Assume there are 20+
columns/rows to go through. Example: for the table below, the pandas code
should drop the rows with ids 0.1 and 0.4 (the second and fifth lines of the
table, counting the header), because every value column in those rows is False.
True and False in the table are dtype bool.
id | Test1 | value1 | value2 | value3 | value4 |
0.1 | 1 | False | False | False | False |
0.2 | 2 | False | True | True | False |
0.3 | 3 | True | False | False | False |
0.4 | 4 | False | False | False | False |
The code below reads the data and keeps only the rows that have at least one True among the value columns. Comments are given on every line explaining the code. Below is the code to copy:

#CODE STARTS HERE----------------
import pandas as pd

#reading sample csv from a file (assumed to be tab-separated)
df = pd.read_csv("del.csv", delimiter="\t")

print("ORIGINAL DATAFRAME:")
print(df) #prints original dataframe

print("\nMODIFIED DATAFRAME:")
#df.iloc[:,2:] => selects all rows with columns from 2 to end. This skips 'id' and 'Test1'
#.any(axis=1) => returns True for each row that has at least one True
#df[df.iloc[:,2:].any(axis=1)] => keeps those rows and drops rows where all values are False
#(putting ~ in front of the mask would select the all-False rows instead)
print(df[df.iloc[:,2:].any(axis=1)]) #prints modified df
#CODE ENDS HERE-----------------
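
For a version that can be run without del.csv, here is a minimal sketch that builds the sample table from the question in memory. It assumes the value columns really are dtype bool, as the question states, and it picks them by dtype rather than by position. The names bool_cols, keep_mask and reduced are illustrative only and do not come from the original answer.

```python
import pandas as pd

# Sample data mirroring the table in the question (built in memory, no CSV needed).
df = pd.DataFrame({
    "id": [0.1, 0.2, 0.3, 0.4],
    "Test1": [1, 2, 3, 4],
    "value1": [False, False, True, False],
    "value2": [False, True, False, False],
    "value3": [False, True, False, False],
    "value4": [False, False, False, False],
})

# Pick the boolean columns by dtype, so 'id' and 'Test1' are skipped
# without relying on their position.
bool_cols = df.select_dtypes(include="bool").columns

# A row is kept if at least one of its boolean columns is True.
keep_mask = df[bool_cols].any(axis=1)

# Rows where every boolean column is False are dropped.
reduced = df[keep_mask].reset_index(drop=True)

print(reduced)
```

Selecting by dtype keeps the filter unchanged when there are 20+ columns or when the non-bool columns are not the first two. If the CSV is read with the bool columns stored as strings, they would need converting to bool first.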