Python – HuntDataScience

Group the people by group size

January 3, 2020 / January 6, 2020 Vinoth

There are n people whose IDs go from 0 to n - 1 and each person belongs exactly to one group. Given the array groupSizes of length n telling the group size each person belongs to, return the groups there are and the people’s IDs each group includes.

You can return any solution in any order and the same applies for IDs. Also, it is guaranteed that there exists at least one solution.

Example 1:

Input: groupSizes = [3,3,3,3,3,1,3]
Output: [[5],[0,1,2],[3,4,6]]
Explanation: 
Other possible solutions are [[2,1,6],[5],[0,4,3]] and [[5],[0,6,2],[4,3,1]].

Example 2:

Input: groupSizes = [2,1,3,3,3,2]
Output: [[1],[0,5],[2,3,4]]
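Because any valid grouping is accepted, comparing a result against one fixed expected output is not a reliable check. The following is a minimal sketch of a checker; the helper name is_valid_grouping is hypothetical and not part of the original problem.

```python
from collections import Counter
from typing import List


def is_valid_grouping(group_sizes: List[int], groups: List[List[int]]) -> bool:
    """Return True if `groups` is an acceptable answer for `group_sizes`."""
    # Every person ID 0..n-1 must appear exactly once across all groups.
    seen = Counter(person for group in groups for person in group)
    if set(seen) != set(range(len(group_sizes))):
        return False
    if any(count != 1 for count in seen.values()):
        return False
    # Each person must sit in a group whose length equals their required size.
    return all(
        group_sizes[person] == len(group)
        for group in groups
        for person in group
    )


sizes = [3, 3, 3, 3, 3, 1, 3]
print(is_valid_grouping(sizes, [[5], [0, 1, 2], [3, 4, 6]]))  # True
print(is_valid_grouping(sizes, [[2, 1, 6], [5], [0, 4, 3]]))  # True
print(is_valid_grouping(sizes, [[5], [0, 1], [2, 3, 4, 6]]))  # False
```

The last call fails because persons 0 and 1 need a group of three but were placed in a group of two.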

Solution link github

Solution:

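from typing import List

class Solution:
    def groupThePeople(self, groupSizes: List[int]) -> List[List[int]]:
        # check[s] collects the IDs seen so far that need a group of size s.
        check = {}
        res = []
        for i in range(len(groupSizes)):
            s = groupSizes[i]
            if s not in check:
                check[s] = []
            check[s].append(i)
            # Once s people are waiting, they form a complete group.
            if len(check[s]) == s:
                res.append(check[s])
                check[s] = []
        return res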

Alternative solution (sort by group size):

from typing import List

class Solution:
    def groupThePeople(self, groupSizes: List[int]) -> List[List[int]]:
        # Pair each ID with its group size as (ID, size) and sort by size,
        # so that people who need the same size end up next to each other.
        listind = sorted(enumerate(groupSizes), key=lambda pair: pair[1])

        res = []
        i = 0
        while i < len(listind):
            size = listind[i][1]
            # The next `size` entries all share this size: cut them off as one group.
            res.append([person for person, _ in listind[i:i + size]])
            i += size
        return res

Machine Learning, Python, Structure Thinking

Graph Algorithm

December 27, 2019 Vinoth

You may already have used some of the graph algorithms listed below, whether in your own projects, on social media platforms (LinkedIn, Facebook), or in real-time applications such as Google Maps.

Connected Components: in layman's terms, a kind of hard clustering algorithm that finds clusters (islands) in related or connected data.
Shortest Path: Dijkstra's algorithm is the classic example; shortest-path search is used extensively in Google Maps to find the shortest routes.
Minimum Spanning Tree: suppose we work for a water pipe laying company or an internet fiber company and need to connect all the cities in the graph using the minimum amount of wire or pipe. A minimum spanning tree gives that layout.
PageRank: used to find the most influential papers from citation data, and by Google to rank web pages.
Centrality Measures: betweenness centrality, for example, quantifies how often a particular node lies on the shortest path between two other nodes.
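To make the first two items concrete, here is a small sketch using only the standard library. The graph representation (a dict of dicts), the function names, and the tiny road map are assumptions made for illustration; they are not taken from the linked article.

```python
import heapq
from collections import deque
from typing import Dict, List, Set, Tuple

Graph = Dict[str, Dict[str, float]]  # node -> {neighbour: edge weight}


def connected_components(graph: Graph) -> List[Set[str]]:
    """Find the islands of an undirected graph with breadth-first search."""
    seen: Set[str] = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


def dijkstra(graph: Graph, source: str) -> Dict[str, float]:
    """Shortest distance from `source` to every reachable node (weights >= 0)."""
    dist = {source: 0.0}
    heap: List[Tuple[float, str]] = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for neighbour, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


# Hypothetical road map with two separate islands of cities.
roads: Graph = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2},
    "C": {"A": 1, "B": 2},
    "D": {"E": 7},
    "E": {"D": 7},
}
print(connected_components(roads))
print(dijkstra(roads, "A"))
```

Going from A to B through C costs 1 + 2 = 3, which beats the direct road of weight 4, and D and E never appear in the distances from A because they sit on a different island.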

Link: https://mlwhiz.com/blog/2019/09/02/graph_algs/?utm_campaign=data-scientists-the-5-graph-algorithms-that-you-should-know&utm_medium=social_link&utm_source=missinglettr

Python

Playing with Regex (Python)

February 13, 2019 sundarakesavan

A regular expression is a special sequence of characters that helps you match or find other strings or sets of strings, using a specialized syntax held in a pattern.

The module re provides full support for regular expressions in Python. The re module raises the exception re.error if an error occurs while compiling or using a regular expression.

Various characters have a special meaning when they are used in a regular expression. To avoid confusion when dealing with regular expressions, we write patterns as raw strings, for example r'expression'.
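A short sketch of why raw strings help and of how re.error shows up; the sample strings and variable names are made up for illustration.

```python
import re

# In a normal string "\n" is one newline character; in a raw string it stays
# as two characters (backslash + n), which is what the regex engine should see.
normal_len = len("\n")
raw_len = len(r"\n")
print(normal_len, raw_len)  # 1 2

# Matching a literal backslash: the pattern needs an escaped backslash.
backslashes = re.findall(r"\\", "C:\\temp\\file")
print(len(backslashes))  # 2

# An invalid pattern raises re.error when it is compiled.
try:
    re.compile(r"(unclosed")
    compile_failed = False
except re.error as exc:
    compile_failed = True
    print("invalid pattern:", exc)
```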

Basic Patterns

The power of regular expressions is that they can specify patterns, not just fixed characters. Here are the most basic patterns which match single chars:

a, X, 9, < - ordinary characters just match themselves exactly. The meta-characters which do not match themselves because they have special meanings are: . ^ $ * + ? { [ ] \ | ( ) (details below)
. (a period) - matches any single character except newline '\n'
\w - (lowercase w) matches a "word" character: a letter or digit or underscore [a-zA-Z0-9_]. Note that although "word" is the mnemonic for this, it only matches a single word char, not a whole word. \W (upper case W) matches any non-word character.
\b - boundary between word and non-word
\s - (lowercase s) matches a single whitespace character: space, newline, return, tab, form feed [ \n\r\t\f]. \S (upper case S) matches any non-whitespace character.
\t, \n, \r - tab, newline, return
\d - decimal digit [0-9] (some older regex utilities do not support \d, but they all support \w and \s)
^ = start, $ = end - match the start or end of the string
\ - inhibit the "specialness" of a character. So, for example, use \. to match a period or \\ to match a backslash. If you are unsure whether a character has special meaning, such as '@', you can put a backslash in front of it, \@, to make sure it is treated just as a character.
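The patterns above can be tried out directly with the re module. The sample strings below are invented for illustration.

```python
import re

# \w, \d and \s each match ONE character of their class.
word_chars = re.findall(r"\w", "a_1-!")
digit_pairs = re.findall(r"\d\d", "Order 66 at 9.30")
spaces = re.findall(r"\s", "a b\tc\n")
print(word_chars)   # ['a', '_', '1']
print(digit_pairs)  # ['66', '30']
print(spaces)       # [' ', '\t', '\n']

# . matches any character except newline; \. matches only a literal period.
any_char = re.findall(r"v\d.\d", "v1.2 v1x2")
literal_dot = re.findall(r"v\d\.\d", "v1.2 v1x2")
print(any_char)     # ['v1.2', 'v1x2']
print(literal_dot)  # ['v1.2']

# \b anchors at word boundaries; ^ and $ anchor at the start and end of the string.
whole_word = re.findall(r"\bcat\b", "cat concat cats")
starts_with = bool(re.search(r"^cat", "cat concat cats"))
ends_with = bool(re.search(r"cat$", "cat concat cats"))
print(whole_word)   # ['cat']
print(starts_with)  # True
print(ends_with)    # False
```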

Repetition

Things get more interesting when you use + and * to specify repetition in the pattern

+ - 1 or more occurrences of the pattern to its left, e.g. 'i+' = one or more i's
* - 0 or more occurrences of the pattern to its left
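A few quick illustrations of + and *; the input strings are arbitrary examples.

```python
import re

# + needs at least one occurrence and then takes as many as it can.
run_of_i = re.search(r"pi+", "piiig").group()
print(run_of_i)  # piii

# The search finds the leftmost match first, even if a longer run comes later.
leftmost = re.search(r"i+", "piigiii").group()
print(leftmost)  # ii

# * also accepts zero occurrences, so \s* allows optional whitespace.
spaced = re.search(r"\d\s*\d\s*\d", "xx1 2   3xx").group()
packed = re.search(r"\d\s*\d\s*\d", "xx123xx").group()
print(repr(spaced))  # '1 2   3'
print(repr(packed))  # '123'

# With no 'b' present, "ab*" still matches the 'a', while "ab+" does not match.
star_match = re.search(r"ab*", "ac").group()
plus_match = re.search(r"ab+", "ac")
print(star_match, plus_match)  # a None
```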

Searching an occurrence of Pattern

re.search(): returns None if the pattern doesn't match, or a match object (re.Match in Python 3) that contains information about the first occurrence of the pattern in the string. This method stops after the first match, so it is better suited to testing a regular expression than to extracting data.

re.match(): attempts to match the pattern at the beginning of the string only. It returns a match object on success and None on failure.

re.findall() : Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found.

Let's see some examples of what we discussed above.
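As a minimal sketch, the three functions can be compared on one sample line. The e-mail pattern is deliberately simplistic and the addresses are made up.

```python
import re

line = "alice@example.com wrote to bob@example.org"
pattern = r"\w+@\w+\.\w+"

# search() scans the whole string and stops at the first match.
first = re.search(pattern, line)
print(first.group())  # alice@example.com
print(first.span())   # (0, 17)

# match() only succeeds if the pattern matches at the very beginning.
at_start = re.match(pattern, line)
not_at_start = re.match(r"wrote", line)
found_later = re.search(r"wrote", line)
print(at_start.group())     # alice@example.com
print(not_at_start)         # None
print(found_later.start())  # 18

# findall() returns every non-overlapping match, left to right.
everyone = re.findall(pattern, line)
print(everyone)  # ['alice@example.com', 'bob@example.org']
```

Always check the result of search() or match() for None before calling group(), since a failed match has no match object.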

Pandas

Difference between map, apply map and apply methods in Pandas

February 3, 2019 sundarakesavan

applymap() works only on a pandas DataFrame; the function is applied to every element individually. (Since pandas 2.1 the same method is available as DataFrame.map(), and applymap() is deprecated.)

apply() works on both Series and DataFrames. On a DataFrame the function receives each column (or each row, with axis=1) as a Series; on a Series it is applied to each element.

map() works only on a pandas Series; what it does depends on the argument passed, which can be a function, a dictionary, or another Series.

Please find the examples here:
Different Apply functions of Pandas.ipynb
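Independently of that notebook, here is a small sketch of the three methods side by side. The DataFrame contents are invented, and the hasattr check is an assumption added so the snippet runs on both older and newer pandas versions.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Element-wise on a DataFrame. DataFrame.applymap was renamed to DataFrame.map
# in pandas 2.1, so use whichever this installation provides.
elementwise = df.map if hasattr(df, "map") else df.applymap
squared = elementwise(lambda x: x ** 2)

# apply() on a DataFrame hands each column (default) or each row (axis=1)
# to the function as a Series.
col_sums = df.apply(lambda col: col.sum())
row_sums = df.apply(lambda row: row.sum(), axis=1)

# apply() on a Series works element by element.
doubled = df["a"].apply(lambda x: x * 2)

# map() on a Series accepts a function or a dictionary lookup.
labels = df["a"].map({1: "one", 2: "two", 3: "three"})
plus_one = df["a"].map(lambda x: x + 1)

print(squared)
print(col_sums)
print(row_sums)
print(doubled.tolist(), labels.tolist(), plus_one.tolist())
```

With a dictionary, values that have no matching key come back as NaN, which makes map() handy for recoding categories but worth checking afterwards.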

Python