Web Scraping Reddit with Scrapy

1. install scrapy

On Windows, first install Microsoft Visual C++ 14.0 (Build Tools) from https://visualstudio.microsoft.com/thank-you-downloading-visual-studio/?sku=BuildTools&rel=16, then run pip install scrapy.

2. create scrapy project

C:\Users\zhuby\hans>scrapy startproject reddit
New Scrapy project 'reddit', using template directory 'c:\python\lib\site-packages\scrapy\templates\project', created in:
C:\Users\zhuby\hans\reddit

You can start your first spider with:
cd reddit
scrapy genspider example example.com

3. create redditspider.py in the spiders directory

C:\Users\zhuby\hans\reddit\reddit\spiders>code redditspider.py

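import scrapy


class RedditSpider(scrapy.Spider):
    name = "reddit"
    start_urls = ["https://www.reddit.com/r/cats"]

    def parse(self, response):
        # Collect the src attribute of every <img> element on the page.
        links = response.xpath("//img/@src")
        html = ""

        for link in links:
            url = link.get()
            # Keep only URLs that look like image files.
            if any(extension in url for extension in [".jpg", ".gif", ".png"]):
                html += """<a href="{url}"
                target="_blank">
                <img src="{url}" height="33%" width="33%">
                </a>""".format(url=url)

        # Write the page once, after the loop, so each link appears exactly once.
        # The with statement closes the file, so no explicit close() is needed.
        with open("frontpage.html", "w") as page:
            page.write(html)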

4. test the redditspider.py

C:\Users\zhuby\hans\reddit>scrapy crawl reddit
2020-06-08 16:14:25 [scrapy.utils.log] INFO: Scrapy 2.1.0 started (bot: reddit)
2020-06-08 16:14:25 [scrapy.utils.log] INFO: Versions: lxml 4.5.1.0, libxml2 2.9.5, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 20.3.0, Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)], pyOpenSSL 19.1.0 (OpenSSL 1.1.1g 21 Apr 2020), cryptography 2.9.2, Platform Windows-10-10.0.19041-SP0
2020-06-08 16:14:25 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2020-06-08 16:14:25 [scrapy.crawler] INFO: Overridden settings:
{'BOT_NAME': 'reddit',
...
When the crawl finishes, open file:///C:/Users/zhuby/hans/reddit/frontpage.html in a browser to see the result.
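
The substring test in parse() also accepts URLs where ".jpg" appears outside the file name, for example in a query string. Below is a minimal sketch of a stricter check that uses only the Python standard library. The helper names is_image_url and build_gallery are hypothetical and not part of Scrapy, and the example.com URLs are placeholders.

```python
from html import escape
from urllib.parse import urlparse

IMAGE_EXTENSIONS = (".jpg", ".gif", ".png")


def is_image_url(url):
    """Return True when the URL path (ignoring the query string) ends with an image extension."""
    path = urlparse(url).path.lower()
    return path.endswith(IMAGE_EXTENSIONS)


def build_gallery(urls):
    """Build one linked <img> tag per image URL and skip everything else."""
    parts = []
    for url in urls:
        if is_image_url(url):
            # Escape the URL so characters such as & and " cannot break the HTML attribute.
            safe = escape(url, quote=True)
            parts.append('<a href="{0}" target="_blank"><img src="{0}" width="33%"></a>'.format(safe))
    return "\n".join(parts)


if __name__ == "__main__":
    sample = [
        "https://example.com/cat.jpg?width=640",   # image with a query string
        "https://example.com/page?file=cat.jpg",   # not an image: extension only in the query
        "https://example.com/kitten.PNG",          # upper-case extension
    ]
    print(build_gallery(sample))
```

Inside the spider, build_gallery([link.get() for link in links]) could replace the loop that builds the html string.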

call openweathermap API with python3

1. Create an account on https://home.openweathermap.org/, create an API key, and save it in config.ini:

[openweathermap]
api=7383374766289c804cbf5a68ac704491

2. create get_weather.py

import configparser
import requests
import sys

def get_api_key():
    config = configparser.ConfigParser()
    config.read('config.ini')
    return config['openweathermap']['api']

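def get_weather(api_key, location):
    url = "https://api.openweathermap.org/data/2.5/weather"
    # Let requests build and URL-encode the query string.
    params = {"q": location, "units": "metric", "appid": api_key}
    r = requests.get(url, params=params, timeout=10)
    # Raise an exception on HTTP errors such as 401 (bad key) or 404 (unknown city).
    r.raise_for_status()
    return r.json()

def main():
    if len(sys.argv) != 2:
        sys.exit("Usage: {} LOCATION".format(sys.argv[0]))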
    location = sys.argv[1]

    api_key = get_api_key()
    weather = get_weather(api_key, location)

    print(weather['main']['temp'])
    print(weather)

if __name__ == '__main__':
    main()
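
get_api_key() above raises a KeyError when config.ini is missing or has no [openweathermap] section. Here is a small sketch of a more forgiving lookup. The environment variable name OPENWEATHERMAP_API_KEY and the function name read_api_key are assumptions made for illustration; OpenWeatherMap does not define them.

```python
import configparser
import os


def read_api_key(config_text, env=None):
    """Return the API key from INI text, falling back to an environment mapping."""
    env = os.environ if env is None else env
    config = configparser.ConfigParser()
    config.read_string(config_text)
    # fallback=None returns None instead of raising when the section or option is missing.
    key = config.get("openweathermap", "api", fallback=None)
    if not key:
        key = env.get("OPENWEATHERMAP_API_KEY")  # hypothetical variable name
    if not key:
        raise RuntimeError("No API key found in config.ini or OPENWEATHERMAP_API_KEY")
    return key


if __name__ == "__main__":
    print(read_api_key("[openweathermap]\napi=your-key-here\n"))
```

Taking the INI text as an argument keeps the function easy to try out; in get_weather.py the text would come from reading config.ini.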
3. test code:

C:\Users\zhuby\hans>python get_weather.py Toronto
20.2
{'coord': {'lon': -79.42, 'lat': 43.7}, 'weather': [{'id': 800, 'main': 'Clear', 'description': 'clear sky', 'icon': '01d'}], 'base': 'stations', 'main': {'temp': 20.2, 'feels_like': 18.43, 'temp_min': 18, 'temp_max': 22.78, 'pressure': 1020, 'humidity': 52}, 'visibility': 14484, 'wind': {'speed': 2.6, 'deg': 100}, 'clouds': {'all': 1}, 'dt': 1591637542, 'sys': {'type': 1, 'id': 941, 'country': 'CA', 'sunrise': 1591608974, 'sunset': 1591664262}, 'timezone': -14400, 'id': 6167865, 'name': 'Toronto', 'cod': 200}
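
The response gives sunrise and sunset as Unix timestamps and timezone as an offset in seconds from UTC. The sketch below turns the relevant fields of the Toronto response shown above into a one-line summary. The function names local_time and summarize are hypothetical, and SAMPLE is a trimmed copy of that response.

```python
from datetime import datetime, timedelta, timezone

# Trimmed copy of the Toronto response printed above.
SAMPLE = {
    "weather": [{"id": 800, "main": "Clear", "description": "clear sky", "icon": "01d"}],
    "main": {"temp": 20.2, "feels_like": 18.43, "humidity": 52},
    "sys": {"country": "CA", "sunrise": 1591608974, "sunset": 1591664262},
    "timezone": -14400,
    "name": "Toronto",
}


def local_time(unix_seconds, utc_offset_seconds):
    """Convert a Unix timestamp to a datetime in the location's own UTC offset."""
    tz = timezone(timedelta(seconds=utc_offset_seconds))
    return datetime.fromtimestamp(unix_seconds, tz=tz)


def summarize(weather):
    """Build a short human-readable summary from a current-weather response."""
    offset = weather["timezone"]
    sunrise = local_time(weather["sys"]["sunrise"], offset)
    sunset = local_time(weather["sys"]["sunset"], offset)
    return "{}: {} C, {}, sunrise {}, sunset {}".format(
        weather["name"],
        weather["main"]["temp"],
        weather["weather"][0]["description"],
        sunrise.strftime("%H:%M"),
        sunset.strftime("%H:%M"),
    )


if __name__ == "__main__":
    print(summarize(SAMPLE))  # Toronto: 20.2 C, clear sky, sunrise 05:36, sunset 20:57
```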

Make a bootable USB drive to install Windows 10

1. Make a bootable USB drive with the Windows utility program DiskPart

If you are comfortable doing the work by hand, you can use the cmd.exe application, better known as “Command Prompt”, to create a bootable USB drive on any Windows version from Vista onward (including Windows 10). The steps are:

Plug the USB drive into your computer’s USB port.
Search for the “cmd” application in the Windows start menu, right-click on the item, and select “Run as administrator” from the context menu. This opens a small window with white text on a black background.
Type the command “diskpart” and confirm your input with the enter key (you’ll also do this after every other entered command). This starts the storage device manager.
Enter the command “list disk” to display all available storage devices.
You can recognize your USB drive by its storage capacity; it is usually listed as “disk 1”. “Disk 0” is usually your computer’s internal hard drive or solid-state drive.
Assuming your USB drive is listed as “disk 1”, enter the command “sel disk 1” to select it (or the corresponding “disk 2”, etc.).
Enter the command “clean” to delete all partitions and data from the USB drive.
Enter the command “create partition primary” to create a main partition.
Enter the command “list par” and select the newly created main partition with “sel par 1”.
Activate the partition with the command “active”.
Format the USB drive with the command “format fs=FAT32 label="WINDOWSUSB" quick override” (in place of “WINDOWSUSB” you can choose another label, as long as it contains no spaces or special characters; the drive will later appear under this name when you plug it into a running Windows computer). Formatting may take a while. You can track its progress in the percentage display.
As soon as the process is finished, enter the command “assign” to automatically assign a drive letter (for example “G:”) to your USB.
Enter “exit” to close DiskPart, and then “exit” again to close the command prompt.

To finish the process, copy the contents of the Windows ISO image to the bootable USB drive. This is done with a basic drag-and-drop. If you’re using an installation disc, you can also drag all setup files from there onto your drive (use the folder options to display all of the hidden files first). The same copy is possible in the command prompt. For a source medium with the drive letter “D:” and a USB drive with the letter “G:”, the corresponding command would look as follows: “xcopy D:\*.* G:\*.* /S /E /F” (all of the spaces are intentional).
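
DiskPart can also read its commands from a text file (diskpart /s followed by the file name). The sketch below only generates such a file's text from the steps above; it does not run anything. The function name build_diskpart_script and the refusal of disk 0 are assumptions added as a safety net, and the disk number must still be checked with “list disk” first.

```python
def build_diskpart_script(disk_number, label="WINDOWSUSB"):
    """Return the text of a DiskPart script that prepares one disk as a bootable FAT32 drive."""
    if not isinstance(disk_number, int) or disk_number < 1:
        # Disk 0 is usually the internal system drive, so this sketch refuses it.
        raise ValueError("disk number must be an integer of 1 or higher")
    if not label.isalnum() or len(label) > 11:
        # FAT32 volume labels are limited to 11 characters; this sketch also disallows
        # spaces and special characters, as the steps above recommend.
        raise ValueError("label must be 1-11 letters or digits")
    commands = [
        "select disk {}".format(disk_number),
        "clean",
        "create partition primary",
        "select partition 1",
        "active",
        'format fs=fat32 label="{}" quick override'.format(label),
        "assign",
        "exit",
    ]
    return "\n".join(commands) + "\n"


if __name__ == "__main__":
    # Print the script for review; save it to a file only after verifying the disk number.
    print(build_diskpart_script(1))
```

Printing the script first makes it possible to review every destructive command before an elevated prompt ever sees it.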

2. create Bootable USB with Rufus

Rufus is a popular tool for creating bootable USB drives and is widely regarded as fast and reliable. It also supports UEFI (“Unified Extensible Firmware Interface”), the mainboard firmware interface that replaced the old BIOS and is found on almost all newer computers. From Windows 8 onward, Rufus can also install “Windows To Go” as a portable operating system on an external storage device.

Operation of the tool is simple:
Open the program with a double-click
Select your USB drive in “Device”
Select “Create a bootable disk using” and the option “ISO Image”
Right-click on the CD-ROM symbol and select the ISO file
Under “New volume label”, you can enter whatever name you like for your USB drive
You’ll receive the warning “ALL DATA ON THIS DEVICE WILL BE DESTROYED”, which you can confirm with “OK”, provided you have already saved any important files from the USB drive
Click on “Start”
As soon as the green bar is full, click on “Finish”
Eject your bootable USB drive with “Safely eject hardware”

3. Create Windows 10 installation media

https://www.microsoft.com/en-ca/software-download/windows10

Click "Download tool now",
then follow "Using the tool to create installation media (USB flash drive) to install Windows 10 on a different PC".

4. After the bootable USB drive is created, change your computer's BIOS/UEFI boot order so that it starts from USB first.

Reboot to start the Windows 10 installation.

CRUD Operations in Python on MySQL

1. install MySQL on Ubuntu

sudo apt update
sudo apt install mysql-server
sudo mysql_secure_installation

Verify the installation:
sudo mysql
mysql> SELECT user,authentication_string,plugin,host FROM mysql.user;
mysql> ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'password';
mysql> FLUSH PRIVILEGES;
mysql> exit
Test your login:
$ mysql -u root -p

2. install MySQL Connector/Python and create db.py

ubuntu@ubunu2004:~$ pip3 install mysql-connector-python

Create db.py to create the database, the table, and the records:

import mysql.connector  # MySQL Connector/Python

# Create the database.
mysqldb = mysql.connector.connect(host="localhost", user="root", password="password")
mycursor = mysqldb.cursor()
mycursor.execute("create database dbpython")
mysqldb.close()

# Create the student table in the dbpython database.
mysqldb = mysql.connector.connect(host="localhost", user="root", password="password", database="dbpython")
mycursor = mysqldb.cursor()
mycursor.execute("create table student(roll INT, name VARCHAR(255), marks INT)")
mysqldb.close()

# Insert three records.
mysqldb = mysql.connector.connect(host="localhost", user="root", password="password", database="dbpython")
mycursor = mysqldb.cursor()
try:
    mycursor.execute("insert into student values(1,'Sarfaraj',80),(2,'Kumar',89),(3,'Sohan',90)")
    mysqldb.commit()  # make the insert permanent
    print('Record inserted successfully...')
except mysql.connector.Error as err:
    # Undo the partial change and report the error instead of hiding it.
    mysqldb.rollback()
    print('Insert failed:', err)
mysqldb.close()

ubuntu@ubunu2004:~$ python3 db.py
Record inserted successfully...

You can also check the result in MySQL:
mysql> show databases;
mysql> use dbpython;
mysql> show tables;
+--------------------+
| Tables_in_dbpython |
+--------------------+
| student            |
+--------------------+
1 row in set (0.00 sec)
mysql> select * from student;
+------+----------+-------+
| roll | name     | marks |
+------+----------+-------+
|    1 | Sarfaraj |    80 |
|    2 | Kumar    |    89 |
|    3 | Sohan    |    90 |
+------+----------+-------+
3 rows in set (0.00 sec)
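
The insert above hard-codes the values in the SQL string. When values come from user input, passing them as parameters avoids quoting problems and SQL injection; MySQL Connector/Python uses %s placeholders for this. A minimal sketch follows. The names INSERT_SQL, insert_students and demo are hypothetical, and the two sample rows are made up for illustration.

```python
INSERT_SQL = "INSERT INTO student (roll, name, marks) VALUES (%s, %s, %s)"


def insert_students(connection, rows):
    """Insert (roll, name, marks) tuples in one transaction; roll back on any failure."""
    cursor = connection.cursor()
    try:
        # executemany() runs the statement once per row; the driver escapes each value.
        cursor.executemany(INSERT_SQL, rows)
        connection.commit()
    except Exception:
        connection.rollback()
        raise
    finally:
        cursor.close()
    return len(rows)


def demo():
    """Run against the local server set up above (needs mysql-connector-python)."""
    import mysql.connector  # imported lazily so insert_students has no hard dependency

    connection = mysql.connector.connect(
        host="localhost", user="root", password="password", database="dbpython"
    )
    try:
        # Made-up rows; the apostrophe in O'Neil needs no manual escaping.
        print(insert_students(connection, [(4, "Asha", 75), (5, "O'Neil", 68)]))
    finally:
        connection.close()
```

Calling demo() inserts the two sample rows; insert_students itself works with any connection object that follows the same cursor/commit/rollback pattern.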

3. create update_record.py

import mysql.connector

mysqldb = mysql.connector.connect(host="localhost", user="root", password="password", database="dbpython")
mycursor = mysqldb.cursor()
try:
    # Update the name and marks of the student with roll number 1.
    mycursor.execute("UPDATE student SET name='Ramu', marks=100 WHERE roll=1")
    mysqldb.commit()  # make the update permanent
    print('Record updated successfully...')
except mysql.connector.Error as err:
    mysqldb.rollback()  # undo the change if the statement failed
    print('Update failed:', err)
mysqldb.close()

4. create delete_record.py

import mysql.connector

mysqldb = mysql.connector.connect(host="localhost", user="root", password="password", database="dbpython")
mycursor = mysqldb.cursor()
try:
    # Delete the student with roll number 3.
    mycursor.execute("DELETE FROM student WHERE roll=3")
    mysqldb.commit()  # make the delete permanent
    print('Record deleted successfully...')
except mysql.connector.Error as err:
    mysqldb.rollback()  # undo the change if the statement failed
    print('Delete failed:', err)
mysqldb.close()

5. test the code with display_db.py

import mysql.connector

mysqldb = mysql.connector.connect(host="localhost", user="root", password="password", database="dbpython")
mycursor = mysqldb.cursor()
try:
    mycursor.execute("select * from student")
    # fetchall() returns every row of the result set as a tuple.
    for roll, name, marks in mycursor.fetchall():
        print(roll, name, marks)
except mysql.connector.Error as err:
    print('Error: unable to fetch data:', err)
mysqldb.close()

ubuntu@ubunu2004:~$ python3 display_db.py
1 Sarfaraj 80
2 Kumar 89
3 Sohan 90
ubuntu@ubunu2004:~$ python3 update_record.py
Record updated successfully...
ubuntu@ubunu2004:~$ python3 display_db.py
1 Ramu 100
2 Kumar 89
3 Sohan 90
ubuntu@ubunu2004:~$ python3 delete_record.py
Record deleted successfully...
ubuntu@ubunu2004:~$ python3 display_db.py
1 Ramu 100
2 Kumar 89
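
Indexing rows by position (row[0], row[1], ...) breaks silently when the column order changes. The sketch below maps each row to a dictionary keyed by column name, using the cursor's description attribute. It runs against an in-memory sqlite3 database purely as a stand-in so it works without a MySQL server; sqlite3 uses ? placeholders where MySQL Connector/Python uses %s. The function name fetch_as_dicts is hypothetical. MySQL Connector/Python also offers cursor(dictionary=True) for the same purpose.

```python
import sqlite3


def fetch_as_dicts(cursor, sql, params=()):
    """Run a query and return each row as a dict keyed by column name."""
    cursor.execute(sql, params)
    # cursor.description holds one entry per result column; the first field is its name.
    columns = [column[0] for column in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


if __name__ == "__main__":
    # sqlite3 stands in for MySQL here so the example is self-contained.
    connection = sqlite3.connect(":memory:")
    cursor = connection.cursor()
    cursor.execute("create table student(roll INT, name VARCHAR(255), marks INT)")
    cursor.executemany("insert into student values(?, ?, ?)", [(1, "Ramu", 100), (2, "Kumar", 89)])
    for student in fetch_as_dicts(cursor, "select * from student where marks >= ?", (90,)):
        print(student["roll"], student["name"], student["marks"])
    connection.close()
```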

NOTE:

  1. if you cannot connect to MySQL remotely, update the bind address:
    sudo vi /etc/mysql/mysql.conf.d/mysqld.cnf
    from:
    bind-address = 127.0.0.1
    change to:
    bind-address = 0.0.0.0
    then restart MySQL
    sudo systemctl restart mysql.service
  2. if you get the error message:
    mysql.connector.errors.DatabaseError: 1130: Host '192.168.0.28' is not allowed to connect to this MySQL server
    root is not allowed to log in remotely. Create a new user and grant it privileges:
    mysql> CREATE USER 'monty'@'%' IDENTIFIED BY 'somIUpass#98';
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'monty'@'%' WITH GRANT OPTION;
    then you can query remotely with this user:
    C:\Users\zhuby\python_code>python display_db.py
    1 Ramu 100
    2 Kumar 89

Flask-RESTful API develop

First install Flask-RESTful with pip install flask-restful, then create api.py:

from flask import Flask
from flask_restful import reqparse, abort, Api, Resource

app = Flask(__name__)
api = Api(app)

TODOS = {
    'todo1': {'task': 'build an API'},
    'todo2': {'task': '?????'},
    'todo3': {'task': 'profit!'},
}


def abort_if_todo_doesnt_exist(todo_id):
    if todo_id not in TODOS:
        abort(404, message="Todo {} doesn't exist".format(todo_id))

parser = reqparse.RequestParser()
parser.add_argument('task')

# shows a single todo item and lets you delete a todo item
class Todo(Resource):
    def get(self, todo_id):
        abort_if_todo_doesnt_exist(todo_id)
        return TODOS[todo_id]

    def delete(self, todo_id):
        abort_if_todo_doesnt_exist(todo_id)
        del TODOS[todo_id]
        return '', 204

    def put(self, todo_id):
        args = parser.parse_args()
        task = {'task': args['task']}
        TODOS[todo_id] = task
        return task, 201


# TodoList
# shows a list of all todos, and lets you POST to add new tasks
class TodoList(Resource):
    def get(self):
        return TODOS

    def post(self):
        args = parser.parse_args()
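        # Compare numeric suffixes; a plain string max() would rank 'todo9' above 'todo10'.
        todo_id = max((int(key.lstrip('todo')) for key in TODOS), default=0) + 1
        todo_id = 'todo%i' % todo_id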
        TODOS[todo_id] = {'task': args['task']}
        return TODOS[todo_id], 201

##
## Actually setup the Api resource routing here
##
api.add_resource(TodoList, '/todos')
api.add_resource(Todo, '/todos/<todo_id>')


if __name__ == '__main__':
    app.run(debug=True)

Start the server:
$ python api.py
 * Running on http://127.0.0.1:5000/
 * Restarting with reloader

GET the list:
$ curl http://localhost:5000/todos
GET a single task:
$ curl http://localhost:5000/todos/todo3
DELETE a task:
$ curl http://localhost:5000/todos/todo2 -X DELETE -v
Add a new task:
$ curl http://localhost:5000/todos -d "task=something new" -X POST -v
Update a task:
$ curl http://localhost:5000/todos/todo3 -d "task=something different" -X PUT -v

You can do the same in Postman: POST to add a new task and PUT to update one.
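
The same calls can be made from Python. The sketch below uses only the standard library and mirrors the curl commands above; it assumes the server from api.py is running on localhost:5000. The names BASE_URL, build_request, send and demo are hypothetical.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # assumes api.py is running locally


def build_request(method, path, task=None):
    """Build a request; a task is sent form-encoded, like curl -d "task=..."."""
    data = None
    if task is not None:
        data = urllib.parse.urlencode({"task": task}).encode("ascii")
    return urllib.request.Request(BASE_URL + path, data=data, method=method)


def send(request):
    """Send the request and return (status code, response body as text)."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")


def demo():
    """Walk through the five operations; requires the API server to be running."""
    print(send(build_request("GET", "/todos")))
    print(send(build_request("GET", "/todos/todo3")))
    print(send(build_request("DELETE", "/todos/todo2")))
    print(send(build_request("POST", "/todos", task="something new")))
    print(send(build_request("PUT", "/todos/todo3", task="something different")))
```

Call demo() while the server is up. Note that urlopen raises urllib.error.HTTPError for 4xx responses, such as the 404 returned for an unknown todo.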

7 Great Utility Libraries for Data Visualization With JavaScript

JavaScript runs the web. You can use it in a browser, you can use it on a server, and you can use it for mobile applications.
Today’s ecosystem is full of great libraries and frameworks helping engineers build powerful, user-centric applications for any platform.
Data visualization was already a hot topic before the Covid-19 pandemic. Companies sit on massive amounts of data and need ways to analyze, interpret, and visualize it.
Whether you’re a data scientist or a programmer that has to deal with data visualization, here are seven great JavaScript frameworks to help you create stunning solutions.

  1. D3
    https://github.com/d3/d3
    D3 currently has 90,000 stars on GitHub, making it one of the most popular JavaScript libraries available. It’s an amazing library for visualizing data with JavaScript using web standards (SVG, Canvas, HTML). It combines powerful interaction and visualization techniques to manipulate the DOM with a data-driven approach.
    It allows for binding arbitrary data to the DOM and then applying transformations to the document.
    Key features are:
    Full capabilities of web standards
    Extremely fast and supports large datasets
    Official and community-developed modules available
  2. three.js
    https://github.com/mrdoob/three.js
    three.js is another great JavaScript library for data visualization that currently has about 60,000 GitHub stars. Its goal is to provide an easy-to-use, simple, and lightweight 3D library with a default WebGL renderer.
    Key features are:
    Default WebGL renderer
    Supports renderers for Canvas 2D, SVG, and CSS3D
    Good documentation
  3. Chart.js
    Chart.js is a simple but flexible JavaScript charting library for designers and developers that has about 50,000 stars on GitHub at the moment. It has great documentation, and it’s pretty easy to get started.
    Key features:
    Mixed chart types
    Out-of-the-box stunning transitions
    Open-source project
    Supports eight chart types
    Responsive

  4. Paper.js
    Paper.js is an open-source vector graphics scripting framework running on top of the HTML5 Canvas. It offers a lot of powerful functionality to create and work with Bézier curves and vector graphics. It’s based on Scriptographer, a scripting environment for Adobe Illustrator. Paper.js is easy to learn for beginners but also has a lot of advanced features for experienced users.
    Key features:
    Easy to get started with
    Well-designed and battle-hardened API
    Based on Scriptographer, using HTML5 standards
    It offers nested layers, groups, paths, compound paths, rasters, symbols, etc.

  5. Fabric.js
    Fabric.js is a great JavaScript framework for working with HTML canvas elements easily. It has both an interactive object model on top of the canvas element and an SVG-to-canvas parser.
    With Fabric, one can easily create simple shapes, like circles, triangles, rectangles, or other polygons, using JavaScript.
    Key features:
    Unit tested
    Modular architecture
    Cross-browser functionality
    It’s fast and follows semantic versioning

  6. ECharts
    ECharts is a powerful visualization and charting library for JavaScript that offers easy ways of adding interactive, intuitive, and highly customizable charts to applications. It currently has about 40,000 stars on GitHub. It’s based on ZRender and written in pure JavaScript.
    Key features:
    Incubator project of the Apache Software Foundation
    Free to use
    Supports multidimensional data analysis
    Active community
    Charts for all sizes of devices

  7. Two.js
    Two.js is a small API for two-dimensional drawing in modern browsers. It’s renderer-agnostic, enabling rendering in multiple contexts, such as WebGL, Canvas2D, or SVG, with the same API.
    Key features:
    Focus on vector shapes
    Relies on scene graphs
    Built-in animation loop
    Features a scalable vector-graphics interpreter

WAS8.5 add a new DS custom property timestampPrecisionReporting

cmd:
./wsadmin.sh -lang jython -f ds2.py TEST_CL

cluster = sys.argv[0]
ds =  AdminConfig.getid('/ServerCluster:'+cluster+'/JDBCProvider:DB2 Universal JDBC Driver Provider/DataSource:web DataSource')
propSet = AdminConfig.showAttribute( ds, 'propertySet' )
print propSet
print AdminConfig.required('J2EEResourceProperty')
url_attr = [ [ 'name', 'timestampPrecisionReporting'  ], [ 'value', 2 ], [ 'type', 'java.lang.String' ], [ 'required', 'false' ] ]
print AdminConfig.create('J2EEResourceProperty', propSet, url_attr)
AdminConfig.save()

WAS8.5 change DS webSphereDefaultIsolationLevel

in DMGR bin dir:
for i in $(cat tt); do ./wsadmin.sh -lang jython -f ds.py "$i"; done

tt:
Web1_CL
Web2_CL
Web3_CL

cluster = sys.argv[0]
ds =  AdminConfig.getid('/ServerCluster:'+cluster+'/JDBCProvider:DB2 Universal JDBC Driver Provider/DataSource:webDataSource')
propSet = AdminConfig.showAttribute( ds, 'propertySet' )
resProps = AdminConfig.showAttribute( propSet, 'resourceProperties' )
rsPropList = resProps[ 1:-1 ].split()
for prop in rsPropList :
    if ( prop.find( 'webSphereDefaultIsolationLevel' ) > -1 ) :
      # Print the current value before changing it.
      print AdminConfig.showAttribute( prop, 'value' )
      url_attr = [ [ 'name', 'webSphereDefaultIsolationLevel'  ], [ 'value', 2 ], [ 'type', 'java.lang.String' ], [ 'required', 'false' ] ]
      AdminConfig.modify( prop, url_attr )
      AdminConfig.save()

WAS8.5 add or modify JVM custom property

# Usage: wsadmin -f addJvmProperty.py server property value [description]
# Usage: ./wsadmin.sh -f addJvmProperty.py jvm_name org.oasis.open.docs.wsn_HOSTNAME soa.sys.com

server = sys.argv[0]
property = sys.argv[1]
value = sys.argv[2]
if (len(sys.argv) == 4):
    descr = sys.argv[3]
else :
    descr = None

# Convert a list of items separated by linefeeds into an array
def getListArray(l):
    return l.splitlines()

# Obtain the "simple" server name
def getServerName(s):
    return AdminConfig.showAttribute(s, 'name')

# Add common attr list to specified Server's JVM
def addPropertiesToServer(s):
    jvm = AdminConfig.list('JavaVirtualMachine', s)

    # Look for existing property so we can replace it (by removing it first)
    currentProps = getListArray(AdminConfig.list("Property", jvm))
    for prop in currentProps:
        if property == AdminConfig.showAttribute(prop, "name"):
            print "Removing existing property from Server %s" % getServerName(s)
            AdminConfig.remove(prop)

    # Store new property in 'systemProperties' object
    print "Adding property to Server %s" % getServerName(s)
    AdminConfig.modify(jvm,[['systemProperties',attr]])

# Construct list with new property name and value
attr = []

if (descr is None):
    print "Adding property %s=%s" % (property,value)
    attr.append([['name',property],['value',value]])
else:
    print "Adding property %s=%s,%s" % (property,value,descr)
    attr.append([['name',property],['value',value],['description',descr]])

# Locate all Application Servers if server is 'all'
if (server == 'all'):
    servers = AdminConfig.list('Server')
    for aServer in getListArray(servers):
        type = AdminConfig.showAttribute(aServer,'serverType')
        if (type == 'APPLICATION_SERVER'):
            addPropertiesToServer(aServer)

# TODO: support comma-separated list of servers

else:
    # Locate specified Server and its JVM
    server = AdminConfig.getid('/Server:'+server+'/')
    addPropertiesToServer(server)

# Save changes
if (AdminConfig.hasChanges()):
    AdminConfig.save()