
How to Use Redis With Python

In this tutorial, you'll learn how to use Python with Redis (pronounced RED-iss, or maybe REE-diss or Red-DEES, depending on who you ask), a lightning-fast in-memory key-value store that can be used for anything from A to Z. Here's what Seven Databases in Seven Weeks, a popular book on databases, has to say about Redis:

It's not just easy to use; it's a joy. If an API is UX for programmers, then Redis should be in the Museum of Modern Art alongside the Mac Cube.

And when it comes to speed, Redis is hard to beat. Reads are fast, and writes are even faster, handling upwards of 100,000 SET operations per second by some benchmarks. (Source)

Intrigued? This tutorial is built for the Python programmer who may have zero to little Redis experience. We'll tackle two tools at once, introducing both Redis itself and one of its Python client libraries, redis-py.

redis-py (which you import as just redis) is one of many Python clients for Redis, but it has the distinction of being billed as "currently the way to go for Python" by the Redis developers themselves. It lets you call Redis commands from Python and get familiar Python objects back in return.

In this tutorial, you'll cover:

  • Installing Redis from source and understanding the purpose of the resulting binaries
  • Learning a slice of Redis itself, including its syntax, protocol, and design
  • Mastering redis-py while also seeing glimpses of how it implements Redis' protocol
  • Setting up and communicating with an Amazon ElastiCache Redis server instance



Installing Redis From Source

As my great-grandfather said, nothing builds grit like installing from source. This section will walk you through downloading, making, and installing Redis. I promise that this won't hurt one bit!

Note: This section is oriented towards installation on Mac OS X or Linux. If you're using Windows, there is a Microsoft fork of Redis that can be installed as a Windows Service. Suffice it to say that Redis as a program lives most comfortably on a Linux box, and that setup and use on Windows may be finicky.

First, download the Redis source code as a tarball:

$ redisurl="http://download.redis.io/redis-stable.tar.gz"
$ curl -s -o redis-stable.tar.gz $redisurl

Next, switch over to root and extract the archive's source code into /usr/local/lib/:

$ sudo su root
$ mkdir -p /usr/local/lib/
$ chmod a+w /usr/local/lib/
$ tar -C /usr/local/lib/ -xzf redis-stable.tar.gz

Optionally, you can now remove the archive itself:

$ rm redis-stable.tar.gz 

This leaves you with a source code repository at /usr/local/lib/redis-stable/. Redis is written in C, so you'll need to compile, link, and install with the make utility:

$ cd /usr/local/lib/redis-stable/
$ make && make install

Using make install performs two actions:

  1. The first make command compiles and links the source code.

  2. The make install part takes the binaries and copies them to /usr/local/bin/ so that you can run them from anywhere (assuming that /usr/local/bin/ is in PATH).

Here are all of the steps so far:

$ redisurl="http://download.redis.io/redis-stable.tar.gz"
$ curl -s -o redis-stable.tar.gz $redisurl
$ sudo su root
$ mkdir -p /usr/local/lib/
$ chmod a+w /usr/local/lib/
$ tar -C /usr/local/lib/ -xzf redis-stable.tar.gz
$ rm redis-stable.tar.gz
$ cd /usr/local/lib/redis-stable/
$ make && make install

At this point, take a moment to confirm that Redis is in your PATH and check its version:

$ redis-cli --version
redis-cli 5.0.3

If your shell can't find redis-cli, check to make sure that /usr/local/bin/ is on your PATH environment variable, and add it if not.

In addition to redis-cli, make install actually leads to a handful of different executable files (and one symlink) being placed at /usr/local/bin/:

$ # A snapshot of executables that come bundled with Redis
$ ls -hFG /usr/local/bin/redis-* | sort
/usr/local/bin/redis-benchmark*
/usr/local/bin/redis-check-aof*
/usr/local/bin/redis-check-rdb*
/usr/local/bin/redis-cli*
/usr/local/bin/redis-sentinel@
/usr/local/bin/redis-server*

While all of these have an intended use, the two you'll probably care about most are redis-cli and redis-server, which we'll outline shortly. But before we get to that, it's fitting to set up some baseline configuration.



Configuring Redis

Redis is highly configurable. While it runs fine out of the box, let's take a minute to set some bare-bones configuration options that relate to database persistence and basic security:

$ sudo su root
$ mkdir -p /etc/redis/
$ touch /etc/redis/6379.conf

Now, write the following to /etc/redis/6379.conf. We'll cover what most of these mean gradually throughout the tutorial:

# /etc/redis/6379.conf

port              6379
daemonize         yes
save              60 1
bind              127.0.0.1
tcp-keepalive     300
dbfilename        dump.rdb
dir               ./
rdbcompression    yes

Redis configuration is self-documenting, with the sample redis.conf file located in the Redis source for your reading pleasure. If you're using Redis in a production system, it pays to block out all distractions and take the time to read this sample file in full to familiarize yourself with the ins and outs of Redis and fine-tune your setup.

Some tutorials, including parts of Redis' documentation, may also suggest running the shell script install_server.sh located in redis/utils/install_server.sh. By all means, you're welcome to run this as a more comprehensive alternative to the above, but take note of a few finer points about install_server.sh:

  • It will not work on Mac OS X, only Debian and Ubuntu Linux.
  • It will inject a fuller set of configuration options into /etc/redis/6379.conf.
  • It will write a System V init script to /etc/init.d/redis_6379 that will let you do sudo service redis_6379 start.

The Redis quickstart guide also contains a section on a more proper Redis setup, but the configuration options above should be totally sufficient for this tutorial and for getting started.

Security Note: A few years back, the author of Redis pointed out security vulnerabilities in earlier versions of Redis if no configuration was set. Redis 3.2 (the current version being 5.0.3 as of March 2019) took steps to prevent this intrusion, setting the protected-mode option to yes by default.

We explicitly set bind 127.0.0.1 to let Redis listen for connections only from the localhost interface, although you would need to expand this whitelist in a real production server. The point of protected-mode is as a safeguard that will mimic this bind-to-localhost behavior if you don't otherwise specify anything under the bind option.

With that squared away, we can now dig into using Redis itself.



Ten Minutes to Redis

This section will provide you with just enough knowledge of Redis to be dangerous, outlining its design and basic usage.


Getting Started

Redis has a client-server architecture and uses a request-response model. This means that you (the client) connect to a Redis server through a TCP connection, on port 6379 by default. You request some action (like some form of reading, writing, getting, setting, or updating), and the server serves you back a response.

There can be many clients talking to the same server, which is really what Redis or any client-server application is all about. Each client does a (typically blocking) read on a socket, waiting for the server response.

The cli in redis-cli stands for command line interface, and the server in redis-server is for, well, running a server. In the same way that you would run python at the command line, you can run redis-cli to jump into an interactive REPL (Read Eval Print Loop) where you can run client commands directly from the shell.

First, however, you'll need to launch redis-server so that you have a running Redis server to talk to. A common way to do this in development is to start a server at localhost (IPv4 address 127.0.0.1), which is the default unless you tell Redis otherwise. You can also pass redis-server the name of your configuration file, which is akin to specifying all of its key-value pairs as command-line arguments:

$ redis-server /etc/redis/6379.conf
31829:C 07 Mar 2019 08:45:04.030 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
31829:C 07 Mar 2019 08:45:04.030 # Redis version=5.0.3, bits=64, commit=00000000, modified=0, pid=31829, just started
31829:C 07 Mar 2019 08:45:04.030 # Configuration loaded

We set the daemonize configuration option to yes, so the server runs in the background. (Otherwise, use --daemonize yes as an option to redis-server.)

Now you're ready to launch the Redis REPL. Type redis-cli on your command line. You'll see the server's host:port pair followed by a > prompt:

127.0.0.1:6379>

Here's one of the simplest Redis commands, PING, which just tests connectivity to the server and returns "PONG" if things are okay:

127.0.0.1:6379> PING
PONG

Redis commands are case-insensitive, although their Python counterparts are most definitely not.

Note: As another sanity check, you can search for the process ID of the Redis server with pgrep:

$ pgrep redis-server
26983

To kill the server, use pkill redis-server from the command line. On Mac OS X, you can also use redis-cli shutdown.

Next, we'll use some of the common Redis commands and compare them to what they would look like in pure Python.



Redis as a Python Dictionary

Redis stands for Remote Dictionary Service.

"You mean, like a Python dictionary?" you may ask.

Yes. Broadly speaking, there are many parallels you can draw between a Python dictionary (or generic hash table) and what Redis is and does:

  • A Redis database holds key:value pairs and supports commands such as GET, SET, and DEL, as well as several hundred additional commands.

  • Redis keys are always strings.

  • Redis values may be a number of different data types. We'll cover some of the more essential value data types in this tutorial: string, list, hashes, and sets. Some advanced types include geospatial items and the new stream type.

  • Many Redis commands operate in constant O(1) time, just like retrieving a value from a Python dict or any hash table.

Redis creator Salvatore Sanfilippo would probably not love the comparison of a Redis database to a plain-vanilla Python dict. He calls the project a "data structure server" (rather than a key-value store, such as memcached) because, to its credit, Redis supports storing additional types of key:value data types besides string:string. But for our purposes here, it's a useful comparison if you're familiar with Python's dictionary object.

Let's jump in and learn by example. Our first toy database (with ID 0) will be a mapping of country:capital city, where we use SET to set key-value pairs:

127.0.0.1:6379> SET Bahamas Nassau
OK
127.0.0.1:6379> SET Croatia Zagreb
OK
127.0.0.1:6379> GET Croatia
"Zagreb"
127.0.0.1:6379> GET Japan
(nil)

The corresponding sequence of statements in pure Python would look like this:

>>> capitals = {}
>>> capitals["Bahamas"] = "Nassau"
>>> capitals["Croatia"] = "Zagreb"
>>> capitals.get("Croatia")
'Zagreb'
>>> capitals.get("Japan")  # None

We use capitals.get("Japan") rather than capitals["Japan"] because Redis returns nil rather than an error when a key is not found, which is loosely analogous to Python's None.
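This contrast can be seen directly in plain Python, with no Redis involved. A quick sketch of the two lookup styles on a dict:

```python
capitals = {"Bahamas": "Nassau", "Croatia": "Zagreb"}

# Bracket indexing raises KeyError when the key is missing
try:
    capitals["Japan"]
    raised = False
except KeyError:
    raised = True

# .get() instead returns a null-like value, much as Redis returns (nil)
missing = capitals.get("Japan")
present = capitals.get("Croatia")
```

Redis' nil reply and dict.get()'s None both let you probe for a key without treating its absence as a failure.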

Redis also allows you to set and get multiple key-value pairs in one command, with MSET and MGET, respectively:

127.0.0.1:6379> MSET Lebanon Beirut Norway Oslo France Paris
OK
127.0.0.1:6379> MGET Lebanon Norway Bahamas
1) "Beirut"
2) "Oslo"
3) "Nassau"

The closest thing in Python is dict.update():

>>> capitals.update({
...     "Lebanon": "Beirut",
...     "Norway": "Oslo",
...     "France": "Paris",
... })
>>> [capitals.get(k) for k in ("Lebanon", "Norway", "Bahamas")]
['Beirut', 'Oslo', 'Nassau']

We use .get() rather than .__getitem__() to mimic Redis' behavior of returning a null-like value when no key is found.

As a third example, the EXISTS command does what it sounds like, which is to check whether a key exists:

127.0.0.1:6379> EXISTS Norway
(integer) 1
127.0.0.1:6379> EXISTS Sweden
(integer) 0

Python has the in keyword to test the same thing, which routes to dict.__contains__(key):

>>> "Norway" in capitals
True
>>> "Sweden" in capitals
False

These few examples are meant to show, using native Python, what's happening at a high level with a few common Redis commands. There's no client-server component here to the Python examples, and redis-py has not yet entered the picture. This is only meant to show Redis functionality by example.

Here's a summary of the few Redis commands you've seen and their functional Python equivalents:

SET:

capitals["Bahamas"] = "Nassau"

GET:

capitals.get("Croatia")

MSET:

capitals.update(
    {
        "Lebanon": "Beirut",
        "Norway": "Oslo",
        "France": "Paris",
    }
)

MGET:

[capitals[k] for k in ("Lebanon", "Norway", "Bahamas")]

EXISTS:

"Norway" in capitals

The Python Redis client library, redis-py, which you'll dive into shortly in this article, does things differently. It encapsulates an actual TCP connection to a Redis server and sends raw commands, as bytes serialized using the REdis Serialization Protocol (RESP), to the server. It then takes the raw reply and parses it back into a Python object such as bytes, int, or even datetime.datetime.

Note: So far, you've been talking to the Redis server through the interactive redis-cli REPL. You can also issue commands directly, in the same way that you would pass the name of a script to the python executable, such as python myscript.py.

So far, you've seen a few of Redis' fundamental data types, which is a mapping of string:string. While this key-value pair is common in most key-value stores, Redis offers a number of other possible value types, which you'll see next.



More Data Types in Python vs Redis

Before you fire up the redis-py Python client, it also helps to have a basic grasp of a few more Redis data types. To be clear, all Redis keys are strings. It's the value that can take on data types (or structures) in addition to the string values used in the examples so far.

A hash is a mapping of string:string, called field-value pairs, that sits under one top-level key:

127.0.0.1:6379> HSET realpython url "https://realpython.com/"
(integer) 1
127.0.0.1:6379> HSET realpython github realpython
(integer) 1
127.0.0.1:6379> HSET realpython fullname "Real Python"
(integer) 1

This sets three field-value pairs for one key, "realpython". If you're used to Python's terminology and objects, this can be confusing. A Redis hash is roughly analogous to a Python dict that is nested one level deep:

data = {
    "realpython": {
        "url": "https://realpython.com/",
        "github": "realpython",
        "fullname": "Real Python",
    }
}

Redis' fields are akin to the Python keys of each nested key-value pair in the inner dictionary above. Redis reserves the term key for the top-level database key that holds the hash structure itself.
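In pure Python, the key-versus-field distinction is just two levels of dictionary lookup. A small sketch mirroring the realpython hash above (the HGET/HKEYS comparisons in the comments are analogies, not live Redis calls):

```python
data = {
    "realpython": {
        "url": "https://realpython.com/",
        "github": "realpython",
        "fullname": "Real Python",
    }
}

# The outer lookup uses the Redis *key*; the inner lookup uses a *field*,
# roughly like: HGET realpython url
url = data["realpython"]["url"]

# Listing the fields of one hash, roughly like: HKEYS realpython
fields = sorted(data["realpython"])
```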

Just like there's MSET for basic string:string key-value pairs, there is also HMSET for hashes to set multiple pairs within the hash value object:

127.0.0.1:6379> HMSET pypa url "https://www.pypa.io/" github pypa fullname "Python Packaging Authority"
OK
127.0.0.1:6379> HGETALL pypa
1) "url"
2) "https://www.pypa.io/"
3) "github"
4) "pypa"
5) "fullname"
6) "Python Packaging Authority"

Using HMSET is probably a closer parallel to the way that we assigned data to a nested dictionary above, rather than setting each nested pair as is done with HSET.

Two additional value types are lists and sets, which can take the place of a hash or string as a Redis value. They are largely what they sound like, so I won't take up your time with additional examples. Hashes, lists, and sets each have some commands that are particular to that given data type, which are in some cases denoted by their initial letter:

  • Hashes: Commands to operate on hashes begin with an H, such as HSET, HGET, or HMSET.

  • Sets: Commands to operate on sets begin with an S, such as SCARD, which gets the number of elements at the set value corresponding to a given key.

  • Lists: Commands to operate on lists begin with an L or R. Examples include LPOP and RPUSH. The L or R refers to which side of the list is operated on. A few list commands are also prefaced with a B, which means blocking. A blocking operation doesn't let other operations interrupt it while it's executing. For instance, BLPOP executes a blocking left-pop on a list structure.

Note: One noteworthy feature of Redis' list type is that it is a linked list rather than an array. This means that appending is O(1), while indexing at an arbitrary index number is O(N).
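As a rough Python analogy for those semantics, collections.deque also offers O(1) pushes and pops at both ends, so it can illustrate what the L and R commands do. This is an analogy only, not how Redis stores lists:

```python
from collections import deque

tasks = deque()

tasks.append("b")       # like RPUSH tasks b
tasks.append("c")       # like RPUSH tasks c
tasks.appendleft("a")   # like LPUSH tasks a

left = tasks.popleft()  # like LPOP tasks
right = tasks.pop()     # like RPOP tasks
```

Just as with a Redis list, working at either end is cheap, while `tasks[n]` for an arbitrary n is where a linked structure pays a linear cost.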

Here's a quick listing of commands that are particular to the string, hash, list, and set data types in Redis:

Type | Commands
Sets | SADD, SCARD, SDIFF, SDIFFSTORE, SINTER, SINTERSTORE, SISMEMBER, SMEMBERS, SMOVE, SPOP, SRANDMEMBER, SREM, SSCAN, SUNION, SUNIONSTORE
Hashes | HDEL, HEXISTS, HGET, HGETALL, HINCRBY, HINCRBYFLOAT, HKEYS, HLEN, HMGET, HMSET, HSCAN, HSET, HSETNX, HSTRLEN, HVALS
Lists | BLPOP, BRPOP, BRPOPLPUSH, LINDEX, LINSERT, LLEN, LPOP, LPUSH, LPUSHX, LRANGE, LREM, LSET, LTRIM, RPOP, RPOPLPUSH, RPUSH, RPUSHX
Strings | APPEND, BITCOUNT, BITFIELD, BITOP, BITPOS, DECR, DECRBY, GET, GETBIT, GETRANGE, GETSET, INCR, INCRBY, INCRBYFLOAT, MGET, MSET, MSETNX, PSETEX, SET, SETBIT, SETEX, SETNX, SETRANGE, STRLEN

This table isn't a complete picture of Redis commands and types. There's a smorgasbord of more advanced data types, such as geospatial items, sorted sets, and HyperLogLog. At the Redis commands page, you can filter by data-structure group. There is also a data types summary and an introduction to Redis data types.

Since we're going to be switching over to doing things in Python, you can now clear your toy database with FLUSHDB and quit the redis-cli REPL:

127.0.0.1:6379> FLUSHDB
OK
127.0.0.1:6379> QUIT

This brings you back to your shell prompt. You can leave redis-server running in the background, since you'll need it for the rest of the tutorial as well.




Using redis-py: Redis in Python

Now that you've mastered some basics of Redis, it's time to jump into redis-py, the Python client that lets you talk to Redis from a user-friendly Python API.


First Steps

redis-py is a well-established Python client library that lets you talk to a Redis server directly through Python calls:

$ python -m pip install redis 

Next, make sure that your Redis server is still up and running in the background. You can check with pgrep redis-server, and if you come up empty-handed, then restart a local server with redis-server /etc/redis/6379.conf.

Now, let's get into the Python-centric part of things. Here's the "hello world" of redis-py:

 1>>> import redis
 2>>> r = redis.Redis()
 3>>> r.mset({"Croatia": "Zagreb", "Bahamas": "Nassau"})
 4True
 5>>> r.get("Bahamas")
 6b'Nassau'

Redis, used in line 2, is the central class of the package and the workhorse by which you execute (almost) any Redis command. The TCP socket connection and reuse is done for you behind the scenes, and you call Redis commands using methods on the class instance r.

Notice also that the type of the returned object, b'Nassau' in line 6, is Python's bytes type, not str. It is bytes rather than str that is the most common return type across redis-py, so you may need to call r.get("Bahamas").decode("utf-8") depending on what you actually want to do with the returned bytestring.
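Decoding is plain Python, so it can be sketched without a live server. The raw values below are stand-ins for what redis-py would hand back; no connection is involved here:

```python
# Stand-ins mimicking redis-py return values (assumption: no server here)
raw_name = b"Nassau"   # e.g. the result of r.get("Bahamas")
raw_price = b"99.99"   # e.g. a price field read from a hash

name = raw_name.decode("utf-8")  # bytes -> str
price = float(raw_price)         # float() accepts a bytes digit string
```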

Does the code above look familiar? The methods in almost all cases match the name of the Redis command that does the same thing. Here, you called r.mset() and r.get(), which correspond to MSET and GET in the native Redis API.

This also means that HGETALL becomes r.hgetall(), PING becomes r.ping(), and so on. There are a few exceptions, but the rule holds for the large majority of commands.

While the Redis command arguments usually translate into a similar-looking method signature, they take Python objects. For example, the call to r.mset() in the example above uses a Python dict as its first argument, rather than a sequence of bytestrings.
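Conceptually, that dict gets flattened into the alternating key-value argument list that the MSET wire command expects. Here's a rough sketch of the mapping (flatten_mset_args is a hypothetical helper for illustration, not redis-py's actual internals):

```python
def flatten_mset_args(mapping):
    """Flatten {"k1": "v1", "k2": "v2"} into ["k1", "v1", "k2", "v2"],
    the alternating shape that MSET takes on the wire."""
    args = []
    for key, value in mapping.items():
        args.append(key)
        args.append(value)
    return args

wire_args = flatten_mset_args({"Croatia": "Zagreb", "Bahamas": "Nassau"})
```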

We built the Redis instance r with no arguments, but it comes bundled with a number of parameters if you need them:

# From redis/client.py
class Redis(object):
    def __init__(self, host='localhost', port=6379,
                 db=0, password=None, socket_timeout=None,
                 # ...

You can see that the default hostname:port pair is localhost:6379, which is exactly what we need in the case of our locally kept redis-server instance.

The db parameter is the database number. You can manage multiple databases in Redis at once, and each is identified by an integer. The max number of databases is 16 by default.

When you run just redis-cli from the command line, this starts you at database 0. Use the -n flag to start at a different database, as in redis-cli -n 5.



Allowed Key Types

One thing that's worth knowing is that redis-py requires that you pass it keys that are bytes, str, int, or float. (It will convert the last 3 of these types to bytes before sending them off to the server.)

Consider a case where you want to use calendar dates as keys:

>>> import datetime
>>> today = datetime.date.today()
>>> visitors = {"dan", "jon", "alex"}
>>> r.sadd(today, *visitors)
Traceback (most recent call last):
# ...
redis.exceptions.DataError: Invalid input of type: 'date'.
Convert to a byte, string or number first.

You'll need to explicitly convert the Python date object to str, which you can do with .isoformat():

>>> stoday = today.isoformat()  # Python 3.7+, or use str(today)
>>> stoday
'2019-03-10'
>>> r.sadd(stoday, *visitors)  # sadd: set-add
3
>>> r.smembers(stoday)
{b'dan', b'alex', b'jon'}
>>> r.scard(today.isoformat())
3

To recap, Redis itself only allows strings as keys. redis-py is a bit more liberal in which Python types it will accept, although it ultimately converts everything to bytes before sending them off to a Redis server.
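That conversion can be pictured with a simplified encoder. encode_key below is a stand-in written for illustration; redis-py's real encoder handles more cases and raises its own DataError rather than TypeError:

```python
def encode_key(key):
    """Simplified sketch: normalize an allowed key type to bytes."""
    if isinstance(key, bytes):
        return key
    if isinstance(key, (int, float)):
        key = repr(key)  # e.g. 3 -> "3", 2.5 -> "2.5"
    if isinstance(key, str):
        return key.encode("utf-8")
    raise TypeError(
        f"Invalid input of type: {type(key).__name__!r}. "
        "Convert to a byte, string or number first."
    )
```

A datetime.date falls through to the final branch, which is why the r.sadd(today, ...) call above failed until the date was stringified.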



Example: PyHats.com

It's time to break out a fuller example. Let's pretend we've decided to start a lucrative website, PyHats.com, that sells outrageously overpriced hats to anyone who will buy them, and hired you to build the site.

You'll use Redis to handle some of the product catalog, inventorying, and bot traffic detection for PyHats.com.

It's day one for the site, and we're going to be selling three limited-edition hats. Each hat gets held in a Redis hash of field-value pairs, and the hash has a key that is a prefixed random integer, such as hat:56854717. Using the hat: prefix is Redis convention for creating a sort of namespace within a Redis database:

import random

random.seed(444)
hats = {f"hat:{random.getrandbits(32)}": i for i in (
    {
        "color": "black",
        "price": 49.99,
        "style": "fitted",
        "quantity": 1000,
        "npurchased": 0,
    },
    {
        "color": "maroon",
        "price": 59.99,
        "style": "hipster",
        "quantity": 500,
        "npurchased": 0,
    },
    {
        "color": "green",
        "price": 99.99,
        "style": "baseball",
        "quantity": 200,
        "npurchased": 0,
    })
}

Let's start with database 1, since we used database 0 in a previous example:

>>> r = redis.Redis(db=1)

To do an initial write of this data into Redis, we can use .hmset() (hash multi-set), calling it for each dictionary. The "multi" is a reference to setting multiple field-value pairs, where "field" in this case corresponds to a key of any of the nested dictionaries in hats:

 1>>> with r.pipeline() as pipe:
 2...    for h_id, hat in hats.items():
 3...        pipe.hmset(h_id, hat)
 4...    pipe.execute()
 5Pipeline<ConnectionPool<Connection<host=localhost,port=6379,db=1>>>
 6Pipeline<ConnectionPool<Connection<host=localhost,port=6379,db=1>>>
 7Pipeline<ConnectionPool<Connection<host=localhost,port=6379,db=1>>>
 8[True, True, True]
 9
10>>> r.bgsave()
11True

The code block above also introduces the concept of Redis pipelining, which is a way to cut down the number of round-trip transactions that you need to write or read data from your Redis server. If you had just called r.hmset() three times, then this would necessitate a round-trip operation for each row written.

With a pipeline, all the commands are buffered on the client side and then sent at once, in one fell swoop, using pipe.hmset() in line 3. This is why the three True responses are all returned at once, when you call pipe.execute() in line 4. You'll see a more advanced use case for a pipeline shortly.
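The buffering idea itself fits in a few lines of plain Python. TinyPipeline below is a toy model for illustration only, not redis-py's Pipeline class, and the True replies are pretend server acknowledgments:

```python
class TinyPipeline:
    """Toy model of client-side command buffering."""

    def __init__(self):
        self.buffer = []

    def hmset(self, key, mapping):
        # Queue the command locally instead of sending it immediately
        self.buffer.append(("HMSET", key, mapping))
        return self  # like redis-py, return the pipeline itself

    def execute(self):
        # One pretend round trip: collect one reply per buffered command
        replies = [True for _ in self.buffer]
        self.buffer.clear()
        return replies

pipe = TinyPipeline()
for key in ("hat:1", "hat:2", "hat:3"):
    pipe.hmset(key, {"quantity": 10})
replies = pipe.execute()
```

Notice that nothing "happens" until execute(): three commands go out in one batch, and three replies come back in one list, mirroring the [True, True, True] above.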

Note: The Redis docs provide an example of doing this same thing with the redis-cli, where you can pipe the contents of a local file to do mass insertion.

Let's do a quick check that everything is there in our Redis database:

>>> pprint(r.hgetall("hat:56854717"))
{b'color': b'green',
 b'npurchased': b'0',
 b'price': b'99.99',
 b'quantity': b'200',
 b'style': b'baseball'}

>>> r.keys()  # Careful on a big DB. keys() is O(N)
[b'hat:56854717', b'hat:1236154736', b'hat:1326692461']

The first thing that we want to simulate is what happens when a user clicks Purchase. If the item is in stock, increase its npurchased by 1 and decrease its quantity (inventory) by 1. You can use .hincrby() to do this:

>>> r.hincrby("hat:56854717", "quantity", -1)
199
>>> r.hget("hat:56854717", "quantity")
b'199'
>>> r.hincrby("hat:56854717", "npurchased", 1)
1

Note: HINCRBY still operates on a hash value that is a string, but it tries to interpret the string as a base-10 64-bit signed integer to execute the operation.

This applies to other commands related to incrementing and decrementing for other data structures, namely INCR, INCRBY, INCRBYFLOAT, ZINCRBY, and HINCRBYFLOAT. You'll get an error if the string at the value can't be represented as an integer.
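The parse-add-restore behavior can be mimicked in a few lines. hincrby_value is a hypothetical helper showing the arithmetic on the client side for illustration; the real work happens inside the Redis server:

```python
def hincrby_value(current: bytes, amount: int) -> bytes:
    """Sketch of HINCRBY's arithmetic: interpret the stored string as a
    base-10 integer, add the delta, and store the result back as a string."""
    n = int(current)  # raises ValueError for non-integer strings
    return str(n + amount).encode()

new_quantity = hincrby_value(b"200", -1)
```

Passing something like b"99.99" would fail at int(), which parallels the error Redis raises when the stored value isn't an integer string.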

It isn't really that simple, though. Changing the quantity and npurchased in two lines of code hides the reality that a click, purchase, and payment entails more than this. We need to do a few more checks to make sure we don't leave someone with a lighter wallet and no hat:

  • Step 1: Check whether the item is in stock, or otherwise raise an exception on the backend.
  • Step 2: If it is in stock, then execute the transaction, decrease the quantity field, and increase the npurchased field.
  • Step 3: Be alert for any changes that alter the inventory in between the first two steps (a race condition).

Step 1 is relatively straightforward: it consists of an .hget() to check the available quantity.

Step 2 is a little bit more involved. The pair of increase and decrease operations need to be executed atomically: either both should be completed successfully, or neither should be (in the case that at least one fails).

With client-server frameworks, it’s always crucial to pay attention to atomicity and look out for what could go wrong in instances where multiple clients are trying to talk to the server at once. The answer to this in Redis is to use a transaction block, meaning that either both or neither of the commands get through.

In redis-py , Pipeline is a transactional pipeline class by default. This means that, even though the class is actually named for something else (pipelining), it can be used to create a transaction block also.

In Redis, a transaction starts with MULTI and ends with EXEC :

 1127.0.0.1:6379> MULTI
 2127.0.0.1:6379> HINCRBY 56854717 quantity -1
 3127.0.0.1:6379> HINCRBY 56854717 npurchased 1
 4127.0.0.1:6379> EXEC

MULTI (Line 1) marks the start of the transaction, and EXEC (Line 4) marks the end. Everything in between is executed as one all-or-nothing buffered sequence of commands. This means that it will be impossible to decrement quantity (Line 2) but then have the balancing npurchased increment operation fail (Line 3).

Let's circle back to Step 3: we need to be aware of any changes that alter the inventory in between the first two steps.

Step 3 is the trickiest. Let’s say that there is one lone hat remaining in our inventory. In between the time that User A checks the quantity of hats remaining and actually processes their transaction, User B also checks the inventory and finds likewise that there is one hat listed in stock. Both users will be allowed to purchase the hat, but we have 1 hat to sell, not 2, so we’re on the hook and one user is out of their money. Not good.

Redis has a clever answer for the dilemma in Step 3: it's called optimistic locking, and is different than how typical locking works in an RDBMS such as PostgreSQL. Optimistic locking, in a nutshell, means that the calling function (client) does not acquire a lock, but rather monitors for changes in the data it is writing to during the time it would have held a lock. If there's a conflict during that time, the calling function simply tries the whole process again.
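The retry pattern can be sketched in plain Python. optimistic_update is a hypothetical, single-threaded toy that only illustrates the snapshot-compute-verify-commit loop; unlike Redis' WATCH/MULTI/EXEC, it is not actually safe under real concurrency, since nothing stops a write between its check and its commit:

```python
def optimistic_update(store, key, update, max_retries=3):
    """Snapshot the value, compute an update, and commit only if the
    value is unchanged; otherwise retry the whole read-modify-write."""
    for _ in range(max_retries):
        snapshot = store[key]          # like WATCH + read
        new_value = update(snapshot)   # client-side computation
        if store[key] == snapshot:     # did anyone write in the meantime?
            store[key] = new_value     # like MULTI/EXEC committing
            return True
    return False  # gave up after repeated conflicts

inventory = {"hat:56854717": 199}
ok = optimistic_update(inventory, "hat:56854717", lambda n: n - 1)
```

The point of the sketch is the shape of the control flow: no lock is held, a conflict just means another trip around the loop.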

You can effect optimistic locking by using the WATCH command (.watch() in redis-py ), which provides a check-and-set behavior.

Let’s introduce a big chunk of code and walk through it afterwards step by step. You can picture buyitem() as being called any time a user clicks on a Buy Now or Purchase button. Its purpose is to confirm the item is in stock and take an action based on that result, all in a safe manner that looks out for race conditions and retries if one is detected:

 1import logging
 2import redis
 3
 4logging.basicConfig()
 5
 6class OutOfStockError(Exception):
 7    """Raised when PyHats.com is all out of today's hottest hat"""
 8
 9def buyitem(r: redis.Redis, itemid: int) -> None:
10    with r.pipeline() as pipe:
11        error_count = 0
12        while True:
13            try:
14                # Get available inventory, watching for changes
15                # related to this itemid before the transaction
16                pipe.watch(itemid)
17                nleft: bytes = r.hget(itemid, "quantity")
18                if nleft > b"0":
19                    pipe.multi()
20                    pipe.hincrby(itemid, "quantity", -1)
21                    pipe.hincrby(itemid, "npurchased", 1)
22                    pipe.execute()
23                    break
24                else:
25                    # Stop watching the itemid and raise to break out
26                    pipe.unwatch()
27                    raise OutOfStockError(
28                        f"Sorry, {itemid} is out of stock!"
29                    )
30            except redis.WatchError:
31                # Log total num. of errors by this user to buy this item,
32                # then try the same process again of WATCH/HGET/MULTI/EXEC
33                error_count += 1
34                logging.warning(
35                    "WatchError #%d: %s; retrying",
36                    error_count, itemid
37                )
38    return None

The critical line occurs at Line 16 with pipe.watch(itemid) , which tells Redis to monitor the given itemid for any changes to its value. The program checks the inventory through the call to r.hget(itemid, "quantity") , in Line 17:

16pipe.watch(itemid)
17nleft: bytes = r.hget(itemid, "quantity")
18if nleft > b"0":
19    # Item in stock. Proceed with transaction.

If the inventory gets touched during this short window between when the user checks the item stock and tries to purchase it, then Redis will return an error, and redis-py will raise a WatchError (Line 30). That is, if any of the hash pointed to by itemid changes after the .hget() call but before the subsequent .hincrby() calls in Lines 20 and 21, then we’ll re-run the whole process in another iteration of the while True loop as a result.

This is the “optimistic” part of the locking:rather than letting the client have a time-consuming total lock on the database through the getting and setting operations, we leave it up to Redis to notify the client and user only in the case that calls for a retry of the inventory check.

One key here is in understanding the difference between client-side and server-side operations:

nleft = r.hget(itemid, "quantity")

This Python assignment brings the result of r.hget() client-side. Conversely, methods that you call on pipe effectively buffer all of the commands into one, and then send them to the server in a single request:

16pipe.multi()
17pipe.hincrby(itemid, "quantity", -1)
18pipe.hincrby(itemid, "npurchased", 1)
19pipe.execute()

No data comes back to the client side in the middle of the transactional pipeline. You need to call .execute() (Line 19) to get the sequence of results back all at once.

Even though this block contains two commands, it consists of exactly one round-trip operation from client to server and back.

This means that the client can’t immediately use the result of pipe.hincrby(itemid, "quantity", -1) , from Line 20, because methods on a Pipeline return just the pipe instance itself. We haven’t asked anything from the server at this point. While normally .hincrby() returns the resulting value, you can’t immediately reference it on the client side until the entire transaction is completed.

There’s a catch-22: this is also why you can’t put the call to .hget() into the transaction block. If you did, then you’d be unable to know whether you want to increment the npurchased field yet, since you can’t get real-time results from commands that are inserted into a transactional pipeline.

Finally, if the inventory sits at zero, then we UNWATCH the item ID and raise an OutOfStockError (Line 27), ultimately displaying that coveted Sold Out page that will make our hat buyers desperately want to buy even more of our hats at ever more outlandish prices:

24                else:
25                    # Stop watching the itemid and raise to break out
26                    pipe.unwatch()
27                    raise OutOfStockError(
28                        f"Sorry, {itemid} is out of stock!"
29                    )

Here’s an illustration. Keep in mind that our starting quantity is 199 for hat 56854717, since we called .hincrby() above. Let’s mimic 3 purchases, which should modify the quantity and npurchased fields:

>>> buyitem(r, "hat:56854717")
>>> buyitem(r, "hat:56854717")
>>> buyitem(r, "hat:56854717")
>>> r.hmget("hat:56854717", "quantity", "npurchased")  # Hash multi-get
[b'196', b'4']

Now, we can fast-forward through more purchases, mimicking a stream of purchases until the stock depletes to zero. Again, picture these coming from a whole bunch of different clients rather than just one Redis instance:

>>> # Buy remaining 196 hats for item 56854717 and deplete stock to 0
>>> for _ in range(196):
...     buyitem(r, "hat:56854717")
>>> r.hmget("hat:56854717", "quantity", "npurchased")
[b'0', b'200']

Now, when some poor user is late to the game, they should be met with an OutOfStockError that tells our application to render an error message page on the frontend:

>>> buyitem(r, "hat:56854717")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 20, in buyitem
__main__.OutOfStockError: Sorry, hat:56854717 is out of stock!

Looks like it’s time to restock.



Using Key Expiry

Let’s introduce key expiry , which is another distinguishing feature in Redis. When you expire a key, that key and its corresponding value will be automatically deleted from the database after a certain number of seconds or at a certain timestamp.

In redis-py , one way that you can accomplish this is through .setex() , which lets you set a basic string:string key-value pair with an expiration:

 1>>> from datetime import timedelta
 2
 3>>> # setex: "SET" with expiration
 4>>> r.setex(
 5...     "runner",
 6...     timedelta(minutes=1),
 7...     value="now you see me, now you don't"
 8... )
 9True

You can specify the second argument as a number in seconds or a timedelta object, as in Line 6 above. I like the latter because it seems less ambiguous and more deliberate.
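The equivalence between the two forms is easy to check. A minimal sketch, showing that a one-minute timedelta boils down to the same number of seconds that an integer argument would express:

```python
from datetime import timedelta

# Both forms express the same one-minute expiry window;
# redis-py accepts either as the time argument of .setex()
as_timedelta = timedelta(minutes=1)
as_seconds = 60

print(int(as_timedelta.total_seconds()))  # → 60
print(int(as_timedelta.total_seconds()) == as_seconds)  # → True
```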

There are also methods (and corresponding Redis commands, of course) to get the remaining lifetime (time-to-live ) of a key that you’ve set to expire:

>>> r.ttl("runner")  # "Time To Live", in seconds
58
>>> r.pttl("runner")  # Like ttl, but milliseconds
54368

Below, you can accelerate the window until expiration, and then watch the key expire, after which r.get() will return None and .exists() will return 0 :

>>> r.get("runner")  # Not expired yet
b"now you see me, now you don't"
>>> r.expire("runner", timedelta(seconds=3))  # Set new expire window
True
>>> # Pause for a few seconds
>>> r.get("runner")
>>> r.exists("runner")  # Key & value are both gone (expired)
0

The table below summarizes commands related to key-value expiration, including the ones covered above. The explanations are taken directly from redis-py method docstrings:

Signature Purpose
r.setex(name, time, value) Sets the value of key name to value that expires in time seconds, where time can be represented by an int or a Python timedelta object
r.psetex(name, time_ms, value) Sets the value of key name to value that expires in time_ms milliseconds, where time_ms can be represented by an int or a Python timedelta object
r.expire(name, time) Sets an expire flag on key name for time seconds, where time can be represented by an int or a Python timedelta object
r.expireat(name, when) Sets an expire flag on key name , where when can be represented as an int indicating Unix time or a Python datetime object
r.persist(name) Removes an expiration on name
r.pexpire(name, time) Sets an expire flag on key name for time milliseconds, and time can be represented by an int or a Python timedelta object
r.pexpireat(name, when) Sets an expire flag on key name , where when can be represented as an int representing Unix time in milliseconds (Unix time * 1000) or a Python datetime object
r.pttl(name) Returns the number of milliseconds until the key name will expire
r.ttl(name) Returns the number of seconds until the key name will expire


PyHats.com, Part 2

A few days after its debut, PyHats.com has attracted so much hype that some enterprising users are creating bots to buy hundreds of items within seconds, which you’ve decided isn’t good for the long-term health of your hat business.

Now that you’ve seen how to expire keys, let’s put it to use on the backend of PyHats.com.

We’re going to create a new Redis client that acts as a consumer (or watcher) and processes a stream of incoming IP addresses, which in turn may come from multiple HTTPS connections to the website’s server.

The watcher’s goal is to monitor a stream of IP addresses from multiple sources, keeping an eye out for a flood of requests from a single address within a suspiciously short amount of time.

Some middleware on the website server pushes all incoming IP addresses into a Redis list with .lpush() . Here’s a crude way of mimicking some incoming IPs, using a fresh Redis database:

>>> r = redis.Redis(db=5)
>>> r.lpush("ips", "51.218.112.236")
1
>>> r.lpush("ips", "90.213.45.98")
2
>>> r.lpush("ips", "115.215.230.176")
3
>>> r.lpush("ips", "51.218.112.236")
4

As you can see, .lpush() returns the length of the list after the push operation succeeds. Each call of .lpush() puts the IP at the beginning of the Redis list that is keyed by the string "ips" .

In this simplified simulation, the requests are all technically from the same client, but you can think of them as potentially coming from many different clients and all being pushed to the same database on the same Redis server.

Now, open up a new shell tab or window and launch a new Python REPL. In this shell, you’ll create a new client that serves a very different purpose than the rest, one that sits in an endless while True loop and does a blocking left-pop (BLPOP) call on the ips list, processing each address:

 1# New shell window or tab
 2
 3import datetime
 4import ipaddress
 5
 6import redis
 7
 8# Where we put all the bad egg IP addresses
 9blacklist = set()
10MAXVISITS = 15
11
12ipwatcher = redis.Redis(db=5)
13
14while True:
15    _, addr = ipwatcher.blpop("ips")
16    addr = ipaddress.ip_address(addr.decode("utf-8"))
17    now = datetime.datetime.utcnow()
18    addrts = f"{addr}:{now.minute}"
19    n = ipwatcher.incrby(addrts, 1)
20    if n >= MAXVISITS:
21        print(f"Hat bot detected!: {addr}")
22        blacklist.add(addr)
23    else:
24        print(f"{now}: saw {addr}")
25    _ = ipwatcher.expire(addrts, 60)

Let’s walk through a few important concepts.

The ipwatcher acts like a consumer, sitting around and waiting for new IPs to be pushed onto the "ips" Redis list. It receives them as bytes, such as b"51.218.112.236", and makes them into a more proper address object with the ipaddress module:

15    _, addr = ipwatcher.blpop("ips")
16    addr = ipaddress.ip_address(addr.decode("utf-8"))

Then you form a Redis string key using the address and minute of the hour at which the ipwatcher saw the address, incrementing the corresponding count by 1 and getting the new count in the process:

17    now = datetime.datetime.utcnow()
18    addrts = f"{addr}:{now.minute}"
19    n = ipwatcher.incrby(addrts, 1)

If the address has been seen more than MAXVISITS , then it looks as if we have a PyHats.com web scraper on our hands trying to create the next tulip bubble. Alas, we have no choice but to give this user back something like a dreaded 403 status code.

We use ipwatcher.expire(addrts, 60) to expire the (address minute) combination 60 seconds from when it was last seen. This is to prevent our database from becoming clogged up with stale one-time page viewers.

If you execute this code block in a new shell, you should immediately see this output:

2019-03-11 15:10:41.489214: saw 51.218.112.236
2019-03-11 15:10:41.490298: saw 115.215.230.176
2019-03-11 15:10:41.490839: saw 90.213.45.98
2019-03-11 15:10:41.491387: saw 51.218.112.236

The output appears right away because those four IPs were sitting in the queue-like list keyed by "ips" , waiting to be pulled out by our ipwatcher . Using .blpop() (or the BLPOP command) will block until an item is available in the list, then pops it off. It behaves like Python’s Queue.get() , which also blocks until an item is available.
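The Queue analogy can be made concrete with a tiny standard-library sketch (no Redis server involved):

```python
from queue import Queue

# Like BLPOP, Queue.get() blocks until an item is available
q = Queue()
q.put("51.218.112.236")  # Analogous to LPUSH feeding the "ips" list
print(q.get())  # Returns immediately because an item was already waiting
```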

Besides just spitting out IP addresses, our ipwatcher has a second job. For a given minute of the hour (minute 0 through minute 59), ipwatcher will classify an IP address as a hat-bot if it sends 15 or more GET requests in that minute.

Switch back to your first shell and mimic a page scraper that blasts the site with 20 requests in a few milliseconds:

for _ in range(20):
    r.lpush("ips", "104.174.118.18")

Finally, toggle back to the second shell holding ipwatcher , and you should see an output like this:

2019-03-11 15:15:43.041363: saw 104.174.118.18
2019-03-11 15:15:43.042027: saw 104.174.118.18
2019-03-11 15:15:43.042598: saw 104.174.118.18
2019-03-11 15:15:43.043143: saw 104.174.118.18
2019-03-11 15:15:43.043725: saw 104.174.118.18
2019-03-11 15:15:43.044244: saw 104.174.118.18
2019-03-11 15:15:43.044760: saw 104.174.118.18
2019-03-11 15:15:43.045288: saw 104.174.118.18
2019-03-11 15:15:43.045806: saw 104.174.118.18
2019-03-11 15:15:43.046318: saw 104.174.118.18
2019-03-11 15:15:43.046829: saw 104.174.118.18
2019-03-11 15:15:43.047392: saw 104.174.118.18
2019-03-11 15:15:43.047966: saw 104.174.118.18
2019-03-11 15:15:43.048479: saw 104.174.118.18
Hat bot detected!: 104.174.118.18
Hat bot detected!: 104.174.118.18
Hat bot detected!: 104.174.118.18
Hat bot detected!: 104.174.118.18
Hat bot detected!: 104.174.118.18
Hat bot detected!: 104.174.118.18

Now, Ctrl+C out of the while True loop, and you’ll see that the offending IP has been added to your blacklist:

>>> blacklist
{IPv4Address('104.174.118.18')}

Can you find the defect in this detection system? The filter checks the minute as .minute rather than the last 60 seconds (a rolling minute). Implementing a rolling check to monitor how many times a user has been seen in the last 60 seconds would be trickier. There’s a crafty solution using Redis’ sorted sets at ClassDojo. Josiah Carlson’s Redis in Action also presents a more elaborate and general-purpose example of this section using an IP-to-location cache table.
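To see what a rolling check means, here is a pure-Python sketch of the sorted-set idea (class and method names are my own, not from the tutorial): keep one timestamp per request, drop those older than the window, and count what remains. With Redis, ZADD, ZREMRANGEBYSCORE, and ZCARD would play those roles.

```python
import time

class RollingWindowLimiter:
    """Sketch of a rolling 60-second rate limit, kept in local memory."""

    def __init__(self, max_hits, window=60.0):
        self.max_hits = max_hits
        self.window = window
        self.hits = {}  # ip -> list of request timestamps

    def allow(self, ip, now=None):
        now = time.time() if now is None else now
        # Drop timestamps that fell out of the rolling window
        recent = [t for t in self.hits.get(ip, []) if t > now - self.window]
        recent.append(now)
        self.hits[ip] = recent
        return len(recent) <= self.max_hits

limiter = RollingWindowLimiter(max_hits=3)
print([limiter.allow("104.174.118.18", now=t) for t in (0, 1, 2, 3)])
# → [True, True, True, False]: the fourth hit within 60 seconds is rejected
```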



Persistence and Snapshotting

One of the reasons that Redis is so fast in both read and write operations is that the database is held in memory (RAM) on the server. However, a Redis database can also be stored (persisted) to disk in a process called snapshotting. The point behind this is to keep a physical backup in binary format so that data can be reconstructed and put back into memory when needed, such as at server startup.

You already enabled snapshotting without knowing it when you set up basic configuration at the beginning of this tutorial with the save option:

# /etc/redis/6379.conf
port 6379
daemonize yes
save 60 1
bind 127.0.0.1
tcp-keepalive 300
dbfilename dump.rdb
dir ./
rdbcompression yes

The format is save <seconds> <changes>. This tells Redis to save the database to disk if both the given number of seconds and the given number of write operations against the database occurred. In this case, we’re telling Redis to save the database to disk every 60 seconds if at least one modifying write operation occurred in that 60-second timespan. This is a fairly aggressive setting versus the sample Redis config file, which uses these three save directives:

# Default redis/redis.conf
save 900 1
save 300 10
save 60 10000
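To make the rule concrete, here is a pure-Python sketch that decides whether any directive has been satisfied (the function name is mine, not a Redis internal):

```python
def snapshot_due(directives, elapsed_seconds, writes):
    """Return True if any (seconds, changes) save directive is met."""
    return any(
        elapsed_seconds >= seconds and writes >= changes
        for seconds, changes in directives
    )

# The three default directives from redis.conf
defaults = [(900, 1), (300, 10), (60, 10000)]

print(snapshot_due(defaults, elapsed_seconds=70, writes=20))   # → False
print(snapshot_due(defaults, elapsed_seconds=310, writes=12))  # → True
```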

An RDB snapshot is a full (rather than incremental) point-in-time capture of the database. (RDB refers to a Redis Database File.) We also specified the directory and file name of the resulting data file that gets written:

# /etc/redis/6379.conf
port 6379
daemonize yes
save 60 1
bind 127.0.0.1
tcp-keepalive 300
dbfilename dump.rdb
dir ./
rdbcompression yes

This instructs Redis to save to a binary data file called dump.rdb in the current working directory of wherever redis-server was executed from:

$ file -b dump.rdb
data

You can also manually invoke a save with the Redis command BGSAVE :

127.0.0.1:6379> BGSAVE
Background saving started

The “BG” in BGSAVE indicates that the save occurs in the background. This option is available in a redis-py method as well:

>>> r.lastsave()  # Redis command: LASTSAVE
datetime.datetime(2019, 3, 10, 21, 56, 50)
>>> r.bgsave()
True
>>> r.lastsave()
datetime.datetime(2019, 3, 10, 22, 4, 2)

This example introduces another new command and method, .lastsave(). In Redis, it returns the Unix timestamp of the last DB save, which Python gives back to you as a datetime object. Above, you can see that the r.lastsave() result changes as a result of r.bgsave().

r.lastsave() will also change if you enable automatic snapshotting with the save configuration option.

To rephrase all of this, there are two ways to enable snapshotting:

  1. Explicitly, through the Redis command BGSAVE or redis-py method .bgsave()
  2. Implicitly, through the save configuration option (which you can also set with .config_set() in redis-py )

RDB snapshotting is fast because the parent process uses the fork() system call to pass off the time-intensive write to disk to a child process, so that the parent process can continue on its way. This is what the background in BGSAVE refers to.
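The fork-and-write pattern can be illustrated with a toy sketch (POSIX-only and vastly simplified; real Redis relies on copy-on-write memory so the child sees a consistent point-in-time view, and the parent does not wait):

```python
import os
import tempfile

# Toy illustration of the idea behind BGSAVE: the parent forks, the
# child writes the "snapshot" to disk, and the parent stays free to
# keep serving requests in the meantime.
data = {"hat:56854717": {"quantity": 199}}
path = os.path.join(tempfile.mkdtemp(), "toy_dump.rdb")

pid = os.fork()
if pid == 0:
    # Child: persist a point-in-time copy, then exit
    with open(path, "w") as f:
        f.write(repr(data))
    os._exit(0)
else:
    # Parent: we wait here only so the demo can read the file;
    # Redis would carry on handling commands instead
    os.waitpid(pid, 0)
    with open(path) as f:
        print(f.read())
```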

There’s also SAVE (.save() in redis-py ), but this does a synchronous (blocking) save rather than using fork() , so you shouldn’t use it without a specific reason.

Even though .bgsave() occurs in the background, it’s not without its costs. The time for fork() itself to occur can actually be substantial if the Redis database is large enough in the first place.

If this is a concern, or if you can’t afford to miss even a tiny slice of data lost due to the periodic nature of RDB snapshotting, then you should look into the append-only file (AOF) strategy that is an alternative to snapshotting. AOF copies Redis commands to disk in real time, allowing you to do a literal command-based reconstruction by replaying these commands.



Serialization Workarounds

Let’s get back to talking about Redis data structures. With its hash data structure, Redis in effect supports nesting one level deep:

127.0.0.1:6379> hset mykey field1 value1 

The Python client equivalent would look like this:

r.hset("mykey", "field1", "value1") 

Here, you can think of "field1":"value1" as being the key-value pair of a Python dict, {"field1":"value1"} , while mykey is the top-level key:

Redis Command Pure-Python Equivalent
r.set("key", "value") r = {"key": "value"}
r.hset("key", "field", "value") r = {"key": {"field": "value"}}

But what if you want the value of this dictionary (the Redis hash) to contain something other than a string, such as a list or nested dictionary with strings as values?

Here’s an example using some JSON-like data to make the distinction clearer:

restaurant_484272 = {
    "name": "Ravagh",
    "type": "Persian",
    "address": {
        "street": {
            "line1": "11 E 30th St",
            "line2": "APT 1",
        },
        "city": "New York",
        "state": "NY",
        "zip": 10016,
    }
}

Say that we want to set a Redis hash with the key 484272 and field-value pairs corresponding to the key-value pairs from restaurant_484272 . Redis does not support this directly, because restaurant_484272 is nested:

>>> r.hmset(484272, restaurant_484272)
Traceback (most recent call last):
# ...
redis.exceptions.DataError: Invalid input of type: 'dict'.
Convert to a byte, string or number first.

You can in fact make this work with Redis. There are two different ways to mimic nested data in redis-py and Redis:

  1. Serialize the values into a string with something like json.dumps()
  2. Use a delimiter in the key strings to mimic nesting in the values

Let’s take a look at an example of each.

Option 1: Serialize the Values Into a String

You can use json.dumps() to serialize the dict into a JSON-formatted string:

>>> import json
>>> r.set(484272, json.dumps(restaurant_484272))
True

If you call .get() , the value you get back will be a bytes object, so don’t forget to deserialize it to get back the original object. json.dumps() and json.loads() are inverses of each other, for serializing and deserializing data, respectively:

>>> from pprint import pprint
>>> pprint(json.loads(r.get(484272)))
{'address': {'city': 'New York',
             'state': 'NY',
             'street': '11 E 30th St',
             'zip': 10016},
 'name': 'Ravagh',
 'type': 'Persian'}

This applies to any serialization protocol, with another common choice being yaml :

>>> import yaml  # python -m pip install PyYAML
>>> yaml.dump(restaurant_484272)
'address: {city: New York, state: NY, street: 11 E 30th St, zip: 10016}\nname: Ravagh\ntype: Persian\n'

No matter what serialization protocol you choose to go with, the concept is the same:you’re taking an object that is unique to Python and converting it to a bytestring that is recognized and exchangeable across multiple languages.
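To keep that conversion in one place, you could wrap the round trip in a pair of tiny helpers (the function names here are my own, not part of redis-py):

```python
import json

def dump_for_redis(obj) -> bytes:
    """Serialize a JSON-compatible object into bytes for storage."""
    return json.dumps(obj).encode("utf-8")

def load_from_redis(raw: bytes):
    """Invert dump_for_redis(), recovering the original object."""
    return json.loads(raw.decode("utf-8"))

original = {"name": "Ravagh", "zip": 10016, "tags": ["persian"]}
assert load_from_redis(dump_for_redis(original)) == original
```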

Option 2: Use a Delimiter in Key Strings

There’s a second option that involves mimicking “nestedness” by concatenating multiple levels of keys in a Python dict . This consists of flattening the nested dictionary through recursion, so that each key is a concatenated string of keys, and the values are the deepest-nested values from the original dictionary. Consider our dictionary object restaurant_484272 :

restaurant_484272 = {
    "name": "Ravagh",
    "type": "Persian",
    "address": {
        "street": {
            "line1": "11 E 30th St",
            "line2": "APT 1",
        },
        "city": "New York",
        "state": "NY",
        "zip": 10016,
    }
}

We want to get it into this form:

{
    "484272:name": "Ravagh",
    "484272:type": "Persian",
    "484272:address:street:line1": "11 E 30th St",
    "484272:address:street:line2": "APT 1",
    "484272:address:city": "New York",
    "484272:address:state": "NY",
    "484272:address:zip": "10016",
}

That’s what setflat_skeys() below does, with the added feature that it does in-place .set() operations on the Redis instance itself rather than returning a copy of the input dictionary:

 1from collections.abc import MutableMapping
 2
 3def setflat_skeys(
 4    r: redis.Redis,
 5    obj: dict,
 6    prefix: str,
 7    delim: str = ":",
 8    *,
 9    _autopfix=""
10) -> None:
11    """Flatten `obj` and set resulting field-value pairs into `r`.
12
13    Calls `.set()` to write to Redis instance inplace and returns None.
14
15    `prefix` is an optional str that prefixes all keys.
16    `delim` is the delimiter that separates the joined, flattened keys.
17    `_autopfix` is used in recursive calls to create de-nested keys.
18
19    The deepest-nested keys must be str, bytes, float, or int.
20    Otherwise a TypeError is raised.
21    """
22    allowed_vtypes = (str, bytes, float, int)
23    for key, value in obj.items():
24        key = _autopfix + key
25        if isinstance(value, allowed_vtypes):
26            r.set(f"{prefix}{delim}{key}", value)
27        elif isinstance(value, MutableMapping):
28            setflat_skeys(
29                r, value, prefix, delim, _autopfix=f"{key}{delim}"
30            )
31        else:
32            raise TypeError(f"Unsupported value type: {type(value)}")

The function iterates over the key-value pairs of obj , first checking the type of the value (Line 25) to see if it looks like it should stop recursing further and set that key-value pair. Otherwise, if the value looks like a dict (Line 27), then it recurses into that mapping, adding the previously seen keys as a key prefix (Line 28).

Let’s see it at work:

>>> r.flushdb()  # Flush database: clear old entries
>>> setflat_skeys(r, restaurant_484272, 484272)
>>> for key in sorted(r.keys("484272*")):  # Filter to this pattern
...     print(f"{repr(key):35}{repr(r.get(key)):15}")
...
b'484272:address:city'            b'New York'
b'484272:address:state'           b'NY'
b'484272:address:street:line1'    b'11 E 30th St'
b'484272:address:street:line2'    b'APT 1'
b'484272:address:zip'             b'10016'
b'484272:name'                    b'Ravagh'
b'484272:type'                    b'Persian'
>>> r.get("484272:address:street:line1")
b'11 E 30th St'

The final loop above uses r.keys("484272*") , where "484272*" is interpreted as a pattern and matches all keys in the database that begin with "484272" .

Notice also how setflat_skeys() calls just .set() rather than .hset() , because we’re working with plain string:string field-value pairs, and the 484272 ID key is prepended to each field string.
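Going the other way is also possible. Here is a sketch of an inverse helper (my own function, not part of the code above) that rebuilds a nested dict from flattened keys, assuming no individual key contains the delimiter:

```python
def unflatten(flat, prefix, delim=":"):
    """Rebuild a nested dict from keys flattened like setflat_skeys() does."""
    nested = {}
    for key, value in flat.items():
        # Strip the "484272:"-style prefix, then walk/create the levels
        parts = key[len(prefix) + len(delim):].split(delim)
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested

flat = {
    "484272:name": "Ravagh",
    "484272:address:city": "New York",
    "484272:address:street:line1": "11 E 30th St",
}
print(unflatten(flat, "484272"))
# → {'name': 'Ravagh', 'address': {'city': 'New York', 'street': {'line1': '11 E 30th St'}}}
```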



Encryption

Another trick to help you sleep well at night is to add symmetric encryption before sending anything to a Redis server. Consider this as an add-on to the security that you should make sure is in place by setting proper values in your Redis configuration. The example below uses the cryptography package:

$ python -m pip install cryptography 

To illustrate, pretend that you have some sensitive cardholder data (CD) that you never want to have sitting around in plaintext on any server, no matter what. Before caching it in Redis, you can serialize the data and then encrypt the serialized string using Fernet:

>>> import json
>>> from cryptography.fernet import Fernet

>>> cipher = Fernet(Fernet.generate_key())
>>> info = {
...     "cardnum": 2211849528391929,
...     "exp": [2020, 9],
...     "cv2": 842,
... }

>>> r.set(
...     "user:1000",
...     cipher.encrypt(json.dumps(info).encode("utf-8"))
... )

>>> r.get("user:1000")
b'gAAAAABcg8-LfQw9TeFZ1eXbi'  # ... [truncated]

>>> cipher.decrypt(r.get("user:1000"))
b'{"cardnum": 2211849528391929, "exp": [2020, 9], "cv2": 842}'

>>> json.loads(cipher.decrypt(r.get("user:1000")))
{'cardnum': 2211849528391929, 'exp': [2020, 9], 'cv2': 842}

Because info contains a value that is a list, you’ll need to serialize this into a string that’s acceptable by Redis. (You could use json, yaml, or any other serialization for this.) Next, you encrypt and decrypt that string using the cipher object. You need to deserialize the decrypted bytes using json.loads() so that you can get the result back into the type of your initial input, a dict.

Note: Fernet uses AES 128 encryption in CBC mode. See the cryptography docs for an example of using AES 256. Whatever you choose to do, use cryptography, not pycrypto (imported as Crypto), which is no longer actively maintained.

If security is paramount, encrypting strings before they make their way across a network connection is never a bad idea.



Compression

One last quick optimization is compression. If bandwidth is a concern or you’re cost-conscious, you can implement a lossless compression and decompression scheme when you send and receive data from Redis. Here’s an example using the bzip2 compression algorithm, which in this extreme case cuts down on the number of bytes sent across the connection by a factor of over 2,000:

 1>>> import bz2
 2
 3>>> blob = "i have a lot to talk about" * 10000
 4>>> len(blob.encode("utf-8"))
 5260000
 6
 7>>> # Set the compressed string as value
 8>>> r.set("msg:500", bz2.compress(blob.encode("utf-8")))
 9>>> r.get("msg:500")
10b'BZh91AY&SY\xdaM\x1eu\x01\x11o\x91\x80@\x002l\x87'  # ... [truncated]
11>>> len(r.get("msg:500"))
12122
13>>> 260_000 / 122  # Magnitude of savings
142131.1475409836066
15
16>>> # Get and decompress the value, then confirm it's equal to the original
17>>> rblob = bz2.decompress(r.get("msg:500")).decode("utf-8")
18>>> rblob == blob
19True
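bz2 isn’t the only standard-library option: zlib and lzma make different speed-versus-size trade-offs. A quick client-side comparison on the same blob (exact byte counts may vary slightly between library versions, but every codec crushes input this repetitive):

```python
import bz2
import lzma
import zlib

blob = ("i have a lot to talk about" * 10000).encode("utf-8")

sizes = {
    "raw": len(blob),
    "zlib": len(zlib.compress(blob)),
    "bz2": len(bz2.compress(blob)),
    "lzma": len(lzma.compress(blob)),
}
print(sizes)
```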

The way that serialization, encryption, and compression are related here is that they all occur client-side. You do some operation on the original object on the client-side that ends up making more efficient use of Redis once you send the string over to the server. The inverse operation then happens again on the client side when you request whatever it was that you sent to the server in the first place.




Using Hiredis

It’s common for a client library such as redis-py to follow a protocol in how it is built. In this case, redis-py implements the REdis Serialization Protocol, or RESP.

Part of fulfilling this protocol consists of converting some Python object into a raw bytestring, sending it to the Redis server, and parsing the response back into an intelligible Python object.

For example, the string response “OK” would come back as "+OK\r\n" , while the integer response 1000 would come back as ":1000\r\n" . This can get more complex with other data types such as RESP arrays.
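Those two reply shapes are simple enough to decode by hand. The toy function below is an illustration only (not how redis-py structures its parser) and handles just these two cases:

```python
def resp_parse(raw: bytes):
    """Toy decoder for two RESP reply types: simple strings and integers."""
    body, _, _ = raw.partition(b"\r\n")
    kind, payload = body[:1], body[1:]
    if kind == b"+":  # Simple string, e.g. b"+OK\r\n"
        return payload.decode()
    if kind == b":":  # Integer, e.g. b":1000\r\n"
        return int(payload)
    raise ValueError(f"unsupported RESP type: {kind!r}")

print(resp_parse(b"+OK\r\n"))    # → OK
print(resp_parse(b":1000\r\n"))  # → 1000
```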

A parser is a tool in the request-response cycle that interprets this raw response and crafts it into something recognizable to the client. redis-py ships with its own parser class, PythonParser , which does the parsing in pure Python. (See .read_response() if you’re curious.)

However, there’s also a C library, Hiredis, that contains a fast parser that can offer significant speedups for some Redis commands such as LRANGE . You can think of Hiredis as an optional accelerator that it doesn’t hurt to have around in niche cases.

All that you have to do to enable redis-py to use the Hiredis parser is to install its Python bindings in the same environment as redis-py :

$ python -m pip install hiredis 

What you’re actually installing here is hiredis-py , which is a Python wrapper for a portion of the hiredis C library.

The nice thing is that you don’t really need to call hiredis yourself. Just pip install it, and this will let redis-py see that it’s available and use its HiredisParser instead of PythonParser .

Internally, redis-py will attempt to import hiredis and use its HiredisParser class if the import succeeds, falling back to its PythonParser otherwise, which may be slower in some cases:

# redis/utils.py
try:
    import hiredis
    HIREDIS_AVAILABLE = True
except ImportError:
    HIREDIS_AVAILABLE = False

# redis/connection.py
if HIREDIS_AVAILABLE:
    DefaultParser = HiredisParser
else:
    DefaultParser = PythonParser


Using Enterprise Redis Applications

While Redis itself is open-source and free, several managed services have sprung up that offer a data store with Redis as the core and some additional features built on top of the open-source Redis server:

  • Amazon ElastiCache for Redis : This is a web service that lets you host a Redis server in the cloud, which you can connect to from an Amazon EC2 instance. For full setup instructions, you can walk through Amazon’s ElastiCache for Redis launch page.

  • Microsoft’s Azure Cache for Redis : This is another capable enterprise-grade service that lets you set up a customizable, secure Redis instance in the cloud.

The designs of the two have some commonalities. You typically specify a custom name for your cache, which is embedded as part of a DNS name, such as demo.abcdef.xz.0009.use1.cache.amazonaws.com (AWS) or demo.redis.cache.windows.net (Azure).

Once you’re set up, here are a few quick tips on how to connect.

From the command line, it’s largely the same as in our earlier examples, but you’ll need to specify a host with the -h flag rather than using the default localhost. For Amazon AWS, execute the following from your instance shell:

$ export REDIS_ENDPOINT="demo.abcdef.xz.0009.use1.cache.amazonaws.com"
$ redis-cli -h $REDIS_ENDPOINT

For Microsoft Azure, you can use a similar call. Azure Cache for Redis uses SSL (port 6380) by default rather than port 6379, allowing for encrypted communication to and from Redis, which plain TCP can’t offer. All that you’ll need to supply in addition is a non-default port and access key:

$ export REDIS_ENDPOINT="demo.redis.cache.windows.net"
$ redis-cli -h $REDIS_ENDPOINT -p 6380 -a <access-key>

The -h flag specifies a host, which as you’ve seen is 127.0.0.1 (localhost) by default.

When you’re using redis-py in Python, it’s always a good idea to keep sensitive variables out of Python scripts themselves, and to be careful about what read and write permissions you afford those files. The Python version would look like this:

>>> import os
>>> import redis

>>> # Specify a DNS endpoint instead of the default localhost
>>> os.environ["REDIS_ENDPOINT"]
'demo.abcdef.xz.0009.use1.cache.amazonaws.com'
>>> r = redis.Redis(host=os.environ["REDIS_ENDPOINT"])

That’s all there is to it. Besides specifying a different host , you can now call command-related methods such as r.get() as normal.
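One way to keep the endpoint out of the script entirely is to read it from the environment with a local fallback. A small sketch (the helper name and REDIS_PORT variable are my own conventions, not AWS or Azure requirements):

```python
import os

def redis_settings(env=os.environ):
    """Pull host and port from an environment mapping, with local defaults."""
    host = env.get("REDIS_ENDPOINT", "127.0.0.1")
    port = int(env.get("REDIS_PORT", "6379"))
    return host, port

print(redis_settings({}))  # → ('127.0.0.1', 6379)
print(redis_settings({
    "REDIS_ENDPOINT": "demo.redis.cache.windows.net",
    "REDIS_PORT": "6380",
}))  # → ('demo.redis.cache.windows.net', 6380)
```

The resulting tuple can then feed redis.Redis(host=..., port=...) as shown above.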

Note: If you want to use solely the combination of redis-py and an AWS or Azure Redis instance, then you don’t really need to install and build Redis locally on your machine, since you need neither redis-cli nor redis-server.

If you’re deploying a medium- to large-scale production application where Redis plays a key role, then going with AWS or Azure’s service solutions can be a scalable, cost-effective, and security-conscious way to operate.



Wrapping Up

That concludes our whirlwind tour of accessing Redis through Python, including installing and using the Redis REPL connected to a Redis server and using redis-py in real-life examples. Here’s some of what you learned:

  • redis-py lets you do (almost) everything that you can do with the Redis CLI through an intuitive Python API.
  • Mastering topics such as persistence, serialization, encryption, and compression lets you use Redis to its full potential.
  • Redis transactions and pipelines are essential parts of the library in more complex situations.
  • Enterprise-level Redis services can help you smoothly use Redis in production.

Redis has an extensive set of features, some of which we didn’t really get to cover here, including server-side Lua scripting, sharding, and master-slave replication. If you think that Redis is up your alley, then make sure to follow developments as it implements an updated protocol, RESP3.



Further Reading

Here are some resources that you can check out to learn more.

Books:

  • Josiah Carlson: Redis in Action
  • Karl Seguin: The Little Redis Book
  • Luc Perkins et al.: Seven Databases in Seven Weeks

Redis in use:

  • Twitter: Real-Time Delivery Architecture at Twitter
  • Spool: Redis bitmaps – Fast, easy, realtime metrics
  • 3scale: Having fun with Redis Replication between Amazon and Rackspace
  • Instagram: Storing hundreds of millions of simple key-value pairs in Redis
  • Craigslist: Redis Sharding at Craigslist
  • Disqus: Redis at Disqus

Other:

  • Digital Ocean: How To Secure Your Redis Installation
  • AWS: ElastiCache for Redis User Guide
  • Microsoft: Azure Cache for Redis
  • Cheatography: Redis Cheat Sheet
  • ClassDojo: Better Rate Limiting With Redis Sorted Sets
  • antirez (Salvatore Sanfilippo): Redis persistence demystified
  • Martin Kleppmann: How to do distributed locking
  • HighScalability: 11 Common Web Use Cases Solved in Redis

