Data Manipulation & Cleaning with Python & R - Part 1

Udara Vimukthi
May 5, 2021

Part 1: Python Basics for Data Science

1. What is Python?

Python is a very popular interpreted, high-level, general-purpose programming language. Its design philosophy emphasizes code readability, notably through significant whitespace: indentation, rather than braces, marks where a block of code begins and ends.

Its language constructs and object-oriented approach aim to help programmers write clear, logical code for small and large-scale projects.

2. Python IDEs

Several popular IDEs can be used with Python in the Data Science field. An IDE (Integrated Development Environment) understands your code much better than a plain text editor. It usually provides features such as build automation, code linting, testing, and debugging.

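E.g.: Jupyter, Spyder, PyCharm, IDLE. Anaconda is often listed alongside these, but it is a Python distribution that bundles tools such as Jupyter and Spyder rather than an IDE itself.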

3. Variables in Python

A Python variable is a name that refers to a value stored in memory. In other words, variables let a Python program hold data and pass it to the computer for processing.
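As a small illustration (the variable names below are made up for this example), assignment binds a name to a value, and the same name can later be bound to a value of a different type:

```python
# Hypothetical variable names, chosen only for illustration.
age = 25             # the name "age" now refers to an integer
height_m = 1.75      # a float
name = "Alice"       # a string

# Python is dynamically typed: a name can be rebound to another type.
age_before = age
age = "twenty-five"

print(type(age_before).__name__, type(age).__name__)  # int str
```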

4. Comments in Python

In computer programming, a comment is a programmer-readable explanation or annotation in the source code of a computer program. Comments are added to make the source code easier for humans to understand, and are generally ignored by compilers and interpreters.
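In Python, a comment starts with a hash sign (#) and runs to the end of the line. A minimal sketch, using a hypothetical function name and an approximate value of pi purely for illustration:

```python
# A full-line comment: the interpreter skips everything after the hash sign.
radius = 2.0  # an inline comment can follow code on the same line

def circle_area(r):
    # Hypothetical helper; 3.14159 is an approximation used only for this example.
    return 3.14159 * r * r

area = circle_area(radius)
print(area)
```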

5. Primitive Data Types of Python

Primitive data types are pre-defined and supported directly by the programming language. Python has 4 primitive data types:

  • Integers
  • Floats
  • Booleans
  • Strings
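The four types listed above can be checked with the built-in type() function. The values and names here are arbitrary examples:

```python
count = 42          # integer
price = 19.99       # float
is_valid = True     # boolean
label = "sample"    # string

# Collect the type name of each value.
type_names = [type(v).__name__ for v in (count, price, is_valid, label)]
print(type_names)  # ['int', 'float', 'bool', 'str']
```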

6. Compound Data Types

Compound data types in Python are quite similar to collections in Java. We have lists, tuples, sets, and dictionaries.

a. Lists

  • Lists are used to store multiple items in a single variable
  • Square brackets are used
  • Lists are mutable
  • Lists can contain any type of data together
  • Indexing starts from 0
  • Lists allow negative indexing (-1 refers to the last item)
  • Several built-in functions and methods are available for list operations
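The list properties above can be sketched as follows, using made-up data:

```python
# Hypothetical data, chosen only to illustrate the points above.
scores = [90, "absent", 72.5, True]   # square brackets, mixed types

first_item = scores[0]     # indexing starts from 0 -> 90
last_item = scores[-1]     # negative indexing counts from the end -> True

scores[1] = 65             # lists are mutable: replace an item in place
scores.append(88)          # list method: add an item at the end
length = len(scores)       # built-in function: number of items -> 5

print(scores)  # [90, 65, 72.5, True, 88]
```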

b. Tuples

  • Tuples are used to store multiple items in a single variable
  • Parentheses (round brackets) are used
  • Tuples are immutable
  • Tuples can contain any type of data together
  • Indexing starts from 0
  • Tuples allow negative indexing
  • A few built-in functions and methods (such as len, count, and index) are available for tuple operations
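A short sketch of the tuple behaviour described above, with illustrative values. Trying to change an item raises a TypeError, which the example catches to show the message:

```python
# Hypothetical data, chosen only for illustration.
point = (3, 4.5, "north")       # parentheses, mixed types

first_coord = point[0]          # 3
last_entry = point[-1]          # 'north'

error_message = None
try:
    point[0] = 10               # tuples cannot be changed after creation
except TypeError as exc:
    error_message = str(exc)

position = point.index("north") # tuple method: position of a value -> 2
print(error_message)
```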

c. Sets

  • Sets are used to store multiple items in a single variable
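  • Curly brackets are used (an empty set is created with set(), because {} creates an empty dictionary)
  • Sets are mutable: items can be added and removed, but each item appears only once (frozenset is the immutable variant)
  • Sets can contain different types of data together, as long as each item is hashable (for example numbers, strings, and tuples, but not lists)
  • Sets are unordered, so there is no indexing
  • Several built-in functions and methods are available for set operations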
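To make the set behaviour concrete, here is a minimal sketch with made-up values. It shows that duplicates are dropped, that a built-in set can be changed with add and discard, and that indexing raises a TypeError:

```python
# Hypothetical data, chosen only for illustration.
tags = {"python", "data", "python"}   # the duplicate "python" is dropped
size = len(tags)                      # 2

tags.add("pandas")                    # add an item
tags.discard("data")                  # remove an item if present

other = {"pandas", "numpy"}
common = tags & other                 # intersection -> {'pandas'}
combined = tags | other               # union

supports_indexing = True
try:
    tags[0]                           # sets have no positions
except TypeError:
    supports_indexing = False

print(sorted(combined))  # ['numpy', 'pandas', 'python']
```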

d. Dictionary

  • Dictionaries have keys and values
  • Curly brackets are used
  • Dictionaries are mutable
  • Dictionary values can be of any type; keys must be hashable (for example strings, numbers, or tuples)
  • Values are accessed through their keys
  • Several built-in functions and methods are available for dictionary operations
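The dictionary points above can be sketched like this. The record and its field names are invented for the example:

```python
# Hypothetical record, chosen only for illustration.
student = {"name": "Alice", "age": 21, "courses": ["stats", "ml"]}

student_name = student["name"]        # access a value through its key
student["age"] = 22                   # dictionaries are mutable: update a value
student["city"] = "Colombo"           # add a new key-value pair

# get() returns a fallback instead of raising KeyError for a missing key.
email = student.get("email", "not provided")

keys = list(student.keys())
print(keys)  # ['name', 'age', 'courses', 'city']
```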

7. What are Python functions?

  • Python functions are reusable blocks of code that can be called from several places in a program
  • Python functions can be created with arguments or without arguments
  • Python functions can be created with return values or without return values
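As a rough sketch of the cases above, here is one function without arguments or a return value and one with both. The function names are hypothetical:

```python
def print_banner():
    # No arguments and no return statement: calling it returns None.
    print("=== report ===")

def mean(values):
    # Takes one argument and returns a value.
    return sum(values) / len(values)

banner_result = print_banner()
average = mean([2, 4, 6])

print(banner_result, average)  # None 4.0
```

Because mean is defined once, it can be reused wherever an average is needed instead of repeating the calculation.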

8. What are Python Strings?

  • A primitive data type of Python
  • Immutable
  • Indexing starts from 0
  • Negative indexing is allowed
  • Several built-in methods are available for string operations
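A brief sketch of these string properties, using an arbitrary example string. String methods return new strings and leave the original unchanged:

```python
text = "Data Science"

first_char = text[0]                          # 'D'
last_char = text[-1]                          # 'e'

upper = text.upper()                          # 'DATA SCIENCE'
words = text.split()                          # ['Data', 'Science']
replaced = text.replace("Data", "Computer")   # 'Computer Science'

can_assign = True
try:
    text[0] = "d"                             # strings cannot be changed in place
except TypeError:
    can_assign = False

print(text)  # Data Science
```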

In this article, we discussed the Python basics that are related to Data Science. Data wrangling and preprocessing are covered in Part 2, the next article.
