Dropping Columns from a DataFrame - .drop() 
Overview 
The drop() function removes one or more columns from a DataFrame, so you can focus on relevant data or streamline further processing. It returns a new DataFrame without the specified columns and leaves the original DataFrame unchanged.
Drop a Single Column 
You can use the drop() function to remove a single column from the DataFrame by providing the column name as an argument.
python
from pyspark.sql import SparkSession
# Create a SparkSession (if not already created)
spark = SparkSession.builder.appName("DropColumnExample").getOrCreate()
# Sample data as a list of tuples, with explicit column names
# (inferring the schema from dictionaries may not preserve this column order)
data = [
("Alice", 30, "USA"),
("Bob", 25, "Canada"),
("Charlie", 35, "UK"),
]
# Create a DataFrame
df = spark.createDataFrame(data, ["name", "age", "country"])
# Drop a single column
df_dropped = df.drop("age")
df_dropped.show()
Output:
+-------+-------+
|   name|country|
+-------+-------+
|  Alice|    USA|
|    Bob| Canada|
|Charlie|     UK|
+-------+-------+
Drop Multiple Columns 
You can also use the drop() function to remove multiple columns from the DataFrame by passing each column name as a separate argument.
python
from pyspark.sql import SparkSession
# Create a SparkSession (if not already created)
spark = SparkSession.builder.appName("DropColumnExample").getOrCreate()
# Sample data as a list of tuples, with explicit column names
# (inferring the schema from dictionaries may not preserve this column order)
data = [
("Alice", 30, "USA"),
("Bob", 25, "Canada"),
("Charlie", 35, "UK"),
]
# Create a DataFrame
df = spark.createDataFrame(data, ["name", "age", "country"])
# Drop multiple columns
df_dropped = df.drop("age", "country")
df_dropped.show()
Output:
+-------+
|   name|
+-------+
|  Alice|
|    Bob|
|Charlie|
+-------+
The drop() function in PySpark removes columns that are not needed for analysis or further processing. Whether you drop a single column or several, it returns a new DataFrame without the unwanted columns and keeps the remaining data intact.
📖👉 Official Doc