Code, Linux, python

Instead of eliminating the GIL, Python should work around it

Note: In the following text, Python refers to CPython.

Python is a great language. For everything it has going for it, it has one big hairy wart: the Global Interpreter Lock. The GIL is a mutex that prevents multiple threads from running Python code at the same time. Unless your program uses a C extension that releases the GIL for an extended period of time, your threads will do nothing but wait for the GIL to become unlocked until it is their turn to run. I wrote a quick test that shows that even when the same work is divided among 25 threads, it runs at the same speed as it does on 1 thread.

This test was run with Python 3.6 in 2018. Each instance of the loop ran in between 0.04 and 0.06 seconds, with the single-threaded and multithreaded code taking turns at being the fastest depending on the iteration.
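The original test script is not reproduced here. A minimal sketch of this kind of comparison, assuming a pure-Python countdown loop as the CPU-bound work (the function names and the workload size are illustrative, not the ones from the original test), might look like this:

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; it needs the GIL for every iteration."""
    while n > 0:
        n -= 1
    return n


def run_single(total):
    """Do all of the work on the calling thread and return elapsed seconds."""
    start = time.perf_counter()
    count_down(total)
    return time.perf_counter() - start


def run_threaded(total, num_threads=25):
    """Split the same work evenly across num_threads threads."""
    chunk = total // num_threads
    threads = [threading.Thread(target=count_down, args=(chunk,))
               for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    total = 1_000_000  # illustrative workload size
    print(f"1 thread:   {run_single(total):.3f}s")
    print(f"25 threads: {run_threaded(total):.3f}s")
```

On a CPython build with the GIL, the two timings should come out roughly equal. The exact numbers depend on the machine and the Python version.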

Even though the threaded code runs in the same amount of time, it uses vastly more CPU time. When the OS tries to run the threads that don’t have the GIL they spin in a while-sleep loop waiting for their turn to hold the GIL.

Threads in Python are real system threads. They have all the overhead of threads and very few of the benefits. Before Asyncio, I/O was the only thing threads in Python were really good for. You could have a reader read from one socket and a writer write to another socket. The threads never ran at exactly the same time, but your whole program wasn’t stuck waiting until some new data came in or the socket timed out. A core Python developer once wrote:

The GIL’s effect on the threads in your program is simple enough that you can write the principle on the back of your hand: “One thread runs Python, while N others sleep or await I/O.”

You could have 50 threads, and at any given time the operating system can try to run one of them. 98% of the time, any given thread would just sit in a while loop waiting for the GIL to become unlocked, wasting clock cycles and accomplishing nothing. Threads in Python are real system threads.  They get scheduled separately but they will never run at the same time unless you are using a C extension that releases the GIL. Python threads have all the overhead of real threads with few of the benefits.

With the advent of Asyncio, threads in Python are now the wrong tool for the job in many situations. Asyncio is able to run multiple tasks on the same thread. Because of this, the CPU caches stay fresh, context switching happens much less, there is no need for locking to maintain state (although it still happens), and locked threads are not stuck in a sleep-while loop waiting for the GIL to become unlocked.

You get all the benefits of threading in Python without the performance hit of constant context switching.  During context switching CPUs invalidate their caches.  The L1 cache is about 100x faster than system memory and the L2 cache is about 25x faster.
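To make the single-thread point concrete, here is a small sketch in which three tasks overlap their waits yet all report the same thread ID. It assumes Python 3.7 or newer for `asyncio.run`, and the worker names and delays are made up for illustration:

```python
import asyncio
import threading


async def worker(name, delay):
    """Pretend to wait on I/O, then report which OS thread ran this task."""
    await asyncio.sleep(delay)  # yields control back to the event loop
    return name, threading.get_ident()


async def main():
    # Three tasks are in flight at once, but no extra threads are created.
    return await asyncio.gather(
        worker("a", 0.03),
        worker("b", 0.02),
        worker("c", 0.01),
    )


def run_demo():
    """Run the event loop to completion and return the collected results."""
    return asyncio.run(main())


if __name__ == "__main__":
    for name, thread_id in run_demo():
        print(name, thread_id)
```

`asyncio.gather` returns results in the order the awaitables were passed in, regardless of which one finishes first.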

Because of this, it seems that Asyncio should make CPU bound tasks faster as well or at least as fast as their multithreaded counterparts. In reality, the same code I posted above runs 3x-5x slower when you throw it on an event loop with Asyncio. Though keep in mind this code abuses async and await in ways that make no sense in the real world.

Asyncio will make IO bound tasks fast but it won’t make your CPU bound code any faster. Today, the multiprocessing module is what many programmers reach for to solve this problem. It uses processes to accomplish what many other languages do with threads. It forks the program allowing two instances of the interpreter to run at once. The downside to this is that all data has to be pickled and you have to deal with the overhead of multiple processes. You can light up every CPU core on your system but passing objects is significantly harder than it is with threads. Usually you end up passing dicts when you really wanted to pass a class.
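The copying behind that pickling requirement can be seen without starting any processes. The sketch below only mimics the serialize/deserialize step by calling the `pickle` module directly; `send_to_worker` and `can_cross_process_boundary` are hypothetical helper names, not part of multiprocessing:

```python
import pickle
import threading


def send_to_worker(obj):
    """Mimic the serialize/deserialize step: the receiver gets a copy."""
    return pickle.loads(pickle.dumps(obj))


def can_cross_process_boundary(obj):
    """Return True if obj can be pickled, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


if __name__ == "__main__":
    job = {"name": "sum", "values": [1, 2, 3]}
    received = send_to_worker(job)
    received["values"].append(4)   # change made on the "worker" side
    print(job["values"])           # the parent's list is untouched
    print(can_cross_process_boundary(threading.Lock()))
```

Because the receiving side works on a copy, changes made there never reach the sender's object, and some objects (a lock, for example) cannot be sent at all.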

There have been attempts to remove the GIL. The Gilectomy was probably the best known and most recent of these efforts. Its last commit was in 2016. The project appears to be abandoned now. The creator of this project gave two talks on what was involved. The GIL had to be replaced with hundreds of locks all over the CPython code. The garbage collector needed major changes. They struggled to get performance to where we are with Python’s current threading implementation. Many if not most C extensions would break.

Python is incredibly dynamic. They were trying to make it possible for any thread to do any of the things Python is known for, such as monkey patching a module or changing the global state from 5 threads at once. These are features that most people would probably never use. In the real world threading is usually done by giving input, doing work and producing output. Most programs keep global state as read only. The ones that don’t probably should.

Most of the time you take input, either from a key press, a web request, or command line arguments and it bubbles up the stack to run the program. I’ve never used the global keyword in a production program. Programs that write to a global state are very hard to multithread even with the GIL. You can never predict what order the OS will allow your threads to run. You can only lock other threads out of using resources.

I would love it if we could get a new threading primitive that has no GIL and no ability to change global state unless the GIL is explicitly requested. It could modify global state by using a decorator that seized the GIL or maybe by using a keyword. It could do all of its number crunching on its own stack and the 3 lines that need to modify the UI would be the only ones that were decorated. Nothing would need to be pickled. You could pass real objects but any modifications to them would throw an exception unless they were done in the GIL decorator. This would be a lot like what you get from multiprocessing today with the ability to modify global state when you need to, without the overhead of processes, and without the need to pickle everything.
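Nothing like this exists in CPython today, so the following is only a sketch of what the API shape could look like. It uses an ordinary `threading.Lock` as a stand-in for "seizing the GIL" and does not remove the GIL or enforce read-only access anywhere else. The names `with_gil`, `_pretend_gil`, `ui_state`, and `report_progress` are all hypothetical:

```python
import functools
import threading

_pretend_gil = threading.Lock()   # stand-in only; this is not the real GIL
ui_state = {"progress": 0}        # hypothetical shared global state


def with_gil(func):
    """Hypothetical decorator: run func only while holding the shared lock."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with _pretend_gil:
            return func(*args, **kwargs)
    return wrapper


@with_gil
def report_progress(amount):
    # The only code in this sketch that touches global state.
    ui_state["progress"] += amount


def crunch(numbers):
    # The number crunching works purely on local data, then publishes once.
    total = sum(n * n for n in numbers)
    report_progress(1)
    return total


def run_workers(num_workers=4):
    """Start num_workers crunching threads and return the final progress."""
    threads = [threading.Thread(target=crunch, args=(range(1000),))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ui_state["progress"]
```

The point of the design is that only the small decorated function ever needs exclusive access, while everything in `crunch` before the call works on data local to the thread.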

Something like this appears to be coming with subinterpreters. This will allow developers to create a blank slate that won’t be bound by the GIL. Hopefully this evolves into something like what I’ve described above.

KDE, Linux, OpenSuse, Uncategorized

Compiz works great with KDE 5

Compiz was pretty much “the thing” that got me started with Linux. Whenever it comes up on online forums, there are usually a few people who say the same thing. When KDE 5 and Gnome 3 made their own compositors and window managers, a lot of the Compiz functionality was replaced. Most people seem to be content with this and Compiz was regarded as a thing of the past. When KDE 5 and Gnome 3 first came out, Compiz was completely incompatible. Now it is possible to get Compiz running on both.

I’ve always felt that Compiz offered a lot more and was generally more useful than the KDE effects. Today I found out, it really isn’t that hard to get Compiz working in KDE 5 with Plasma.

I’m using openSUSE but you should be able to find these packages in whatever distribution you are using:
compiz
compiz-emerald
compiz-emerald-themes
compiz-plugins
compizconfig-settings-manager
fusion-icon

Compiz is the window manager, Emerald is the window decorator, Compizconfig Settings Manager is the configuration tool, and Fusion Icon sets everything up.

You’ll want to disable KDE’s desktop effects. Search “Compositor” in your application menu and disable “Enable compositor on startup”.

Search “Autostart” in your application menu. Add fusion-icon to run on startup. Run fusion-icon from a terminal or the applications menu, and you should be able to change your window manager to Compiz. Right click the Fusion icon and choose “Select Window Manager”. Once that is set, it will replace your window manager on startup. From the Fusion Icon, you can set your Emerald Theme and Compiz settings.

If you log back in and you don’t have a desktop, or your desktop is blank, this is easy to fix. The blank desktop issue seems to happen if Compiz loads too fast. Replacing the autostart fusion-icon command with a script with the contents sleep 5; fusion-icon seems to give KDE enough time to load the desktop before Compiz loads. Compiz still starts while KDE is loading, so you don’t see a hacky switch in window managers 5 seconds into your desktop session.

If you want to use the “Windows Previews” plugin in Compiz, you may see two window previews when you hover over your task manager if KDE’s window previews are turned on. To disable this, right click your task manager, click “Task Manager Settings” and uncheck “Show Previews”.

So why even use Compiz? One of the main features for me is, just by holding down shift while switching desktops, I can bring the window with me while moving to different sides of the cube. I’ve never been able to find a way to do this in KDE. There are also a lot more features, plugins, and themes.

A lot of it seems frozen in time. I remember a lot of the Emerald themes from 10 years ago, but they still work fine.

I think Compiz was good for the Linux community. It got a lot of people talking about Linux and a lot of people using Linux. It is kind of unfortunate that it was shut out by the big two desktop environments. When Compiz was popular, I remember seeing new plugins in the Compiz settings manager every few weeks. It has been years since KDE 5 was released and there are hardly any plugins for its “Desktop Effects”.

I know a lot of Linux users dismiss Compiz as pointless “bling”. Even if this were true, people were sharing Compiz videos and people were trying Linux just for Compiz. I think it would have been better if Gnome and KDE hadn’t shut Compiz out.

Whatever the issue was with KDE 5, it seems to be fixed. With Gnome, even though most online posts say it is impossible to run Compiz, it has been reported to work if you start Gnome in “fallback mode”.

Elementary OS, Linux, Uncategorized

Updates to Relay

I pushed out some updates to Relay. You can find the changes on Github or Launchpad. Relay is an elegant and sleek IRC client designed for Elementary OS but will work on any Linux OS.

Relay will try to switch to a theme that looks good. You can now disable this by passing the -t option.

I also added better Unicode support and fixed an issue that was causing it to close prematurely.

Here is what Relay looks like. It’s one of the nicest looking IRC clients out there.

Screenshot from 2015-07-04 13:52:24

Elementary OS, Linux, Ubuntu

Create BTRFS Snapshots With Each apt-get Transaction

So I took it upon myself to fork apt-btrfs-snapshot. It is a program that takes BTRFS snapshots after each apt transaction. I wanted it to use Snapper because Snapper has a GUI. Snapper also abstracts all of the functionality of working with BTRFS snapshots.

Here are some of the things it provides:

  • Management via a GUI
  • Rollbacks without mounting anything
  • A list of what files were changed and their filesizes
  • Tracking of what packages were installed
  • Pre and post snapshots
  • Automatic clean up

You can check it out on github:
https://github.com/agronick/apt-btrfs-snapper

64bit .deb

Ubuntu 14.04    Ubuntu 14.10    Ubuntu 15.04
32bit .deb      32bit .deb      32bit .deb
64bit .deb      64bit .deb      64bit .deb

You can use a tool called gdebi to grab all the dependencies you need, which are really only Python and Snapper. If you want this done for you, run:
gdebi apt-btrfs-snapper_0.4.1_all.deb

Management via a GUI

You can check out this post on installing Snapper GUI on Ubuntu. As you can see below you get a list of all your snapshots and in the bottom you can see what packages were installed. If you hold down control you can select two snapshots and open up the changes view to see what files were changed.

Snapper-GUI on Ubuntu

Rollbacks without mounting anything

To rollback to a previous version you just type:
sudo apt-btrfs-snapper --restore <ID>
Replace <ID> with the snapshot ID or the snapshot name. This will delete, create, and modify your files to get your machine back in the state that it was in when that snapshot was created. You can then roll forward in time just by using a newer ID. You don’t need to restart anything.

A list of what files were changed and their filesizes

You can get a list of snapshots with:
sudo apt-btrfs-snapper list
You can then see what files were changed between two snapshots with:
sudo apt-btrfs-snapper diff

Here is a sample of that output:

c   391 B      /usr/share/doc/maya-calendar-plugin-caldav/changelog.gz
c   391 B      /usr/share/doc/maya-calendar-plugin-google/changelog.gz
c   391 B      /usr/share/doc/maya-calendar/changelog.gz
c   542 B      /usr/share/doc/pantheon-files/changelog.gz
c   246 B      /usr/share/doc/pantheon-photos-common/changelog.Debian.gz
c   246 B      /usr/share/doc/pantheon-photos/changelog.Debian.gz
c   854 B      /usr/share/doc/plank/changelog.Debian.gz 

You can use snapper itself to restore an individual file to a specific state.

Tracking of what packages were installed

apt-btrfs-snapper saves the names of all the packages that were installed in the user data of each snapshot. The best place to view this is in snapper-gui. It can be viewed in the snapper command line tools but it is hard to read. You can see this in the bottom pane in the screenshot above.

Pre and post snapshots

apt-btrfs-snapper takes a snapshot before and after each transaction. They are grouped together in snapper-gui. You can easily see what changes took place between the two snapshots.

Automatic clean up

One of the best parts of snapper is the clean up algorithms built into it. apt-btrfs-snapper simply uses the configuration settings set for the number cleanup algorithm, which is part of snapper.

So check it out. It’s stable, works great, and makes taking and manipulating BTRFS snapshots a lot easier.

Linux

Installing Snapper-GUI on Ubuntu: A GUI for BTRFS Snapshots

Snapper GUI is a great program and one you absolutely need if you are using Snapper on a desktop. Snapper is a program that helps manage snapshots on the btrfs filesystem. This quick guide will go over how to install it on Ubuntu.

Snapper-GUI on Ubuntu

Run the following in a terminal.

First install the packages you will need to run:
sudo apt-get install python3 libgtksourceview-3.0-1 python3-dbus python3-setuptools git

Then clone the snapper-gui Git repo somewhere:
git clone --depth=1 https://github.com/ricardomv/snapper-gui.git

cd into the snapper-gui folder Git created and run:
sudo python3 setup.py install

Now run the program:
snapper-gui

If you haven’t made a config with snapper yet, first run:
snapper create-config /

Now that you have it installed you can use apt-btrfs-snapper to take a snapper snapshot every time you do an apt-get transaction.

Bash Scripts, Elementary OS, Linux

Updated Desktop Slideshow script for ElementaryOS

A few days ago I released a desktop wallpaper slideshow script for ElementaryOS. A user pointed out that it wasn’t changing the login screen wallpaper. I added a fix and now your login screen will have a random background, the same one as the desktop slideshow. If there is a big demand for them to be independent of each other I may make the desktop slideshow differ from the login screen.

You can still use the -bootonly flag to set a single random wallpaper when you log in to ElementaryOS. This will now also change your login screen’s wallpaper.

If you would rather not change the login screen background from the default ElementaryOS one, you can use the --nologin flag.

To change the login screen you will need qdbus. You can install it with apt-get install qdbus.

I added a bunch of logging, which is useful if you give the desktop slideshow script a large number of files to work with. Occasionally you may see an x on your desktop indicating that an image couldn’t load. You can then check the logs with tail -f /var/log/syslog and see what image is giving you issues. Then you can delete it or move it. You must enable logging with the --log flag for this to work.

As always you can get the wallpaper slideshow script from GitHub. Check out the last post for more information on installing and running the Wallpaper Slideshow script. Let me know if you encounter any issues. It’s always good to hear feedback.

Bash Scripts, Linux

Get the size of your BTRFS Snapshots

If you want to get the size of your BTRFS snapshots you would probably use btrfs qgroup show.  This only shows you a list of IDs and the sizes are in bytes. I wrote a script that will convert the sizes from bytes to kilobytes, megabytes or gigabytes. It will combine the IDs with the name of each snapshot or subvolume from btrfs subvolume list to make each row a lot more meaningful.

In the end instead of seeing a list like this:
Screenshot from 2015-05-26 15:47:24

You’ll see:

Detailed information of each BTRFS snapshot

Instead of meaningless IDs you now have the name of your BTRFS subvolumes or snapshots. Instead of a hard to decipher string of bytes it converts each amount into the most appropriate unit of measurement. You can also see the total amount of data that is being used by the snapshots.
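The script itself is written in Bash and is not reproduced here. As a rough illustration of the two ideas it combines (unit conversion and ID-to-name lookup), here is a small Python sketch. It assumes 1024-based units and that a qgroup ID such as "0/257" refers to subvolume ID 257; the function names and the sample data in the usage lines are made up:

```python
def human_size(num_bytes):
    """Convert a byte count to the largest unit that keeps the value >= 1."""
    units = ["B", "KB", "MB", "GB", "TB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the last unit.
        if size < 1024 or unit == units[-1]:
            return f"{size:.2f} {unit}"
        size /= 1024


def label_rows(qgroups, names):
    """Attach a subvolume name to each (qgroup_id, total, exclusive) row."""
    rows = []
    for qgroup_id, total, exclusive in qgroups:
        subvol_id = int(qgroup_id.split("/")[1])   # "0/257" -> 257
        name = names.get(subvol_id, "(unknown)")
        rows.append((name, human_size(total), human_size(exclusive)))
    return rows


if __name__ == "__main__":
    sample_qgroups = [("0/257", 5 * 1024 ** 3, 1536)]   # made-up numbers
    sample_names = {257: "@home"}                        # made-up name
    for row in label_rows(sample_qgroups, sample_names):
        print(*row)
```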

For this to work you first need to enable  quotas. Run this command to enable quotas if you haven’t done so already:

sudo btrfs quota enable /

You can clone the project from github by running:
git clone https://github.com/agronick/btrfs-size.git

Or you can just do a wget on the script:
wget https://raw.githubusercontent.com/agronick/btrfs-size/master/btrfs-size.sh

Set it to executable with:
chmod +x ./btrfs-size.sh

Now you can just run the script with: ./btrfs-size.sh

All the columns are pretty self-explanatory. The Total column tells you how much data is in that BTRFS subvolume. The Exclusive Data column is how much data is exclusive to that subvolume. Since BTRFS is a “copy on write” filesystem, none of the data is replicated when you create a snapshot. It only needs to make a copy when something changes.

Leave your feedback here to let me know how it worked for you.