#python - kbin.social

treyhunner, 7 minutes ago to python

The bisect module has an implementation of binary search for you.

Read the full article: Python Big O: the time complexities of different data structures in Python
▸ https://trey.io/d8D57O

#Python
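
A minimal sketch of what that looks like in practice, assuming the list is already sorted (the helper name `contains` and the sample data are made up for illustration):

```python
from bisect import bisect_left


def contains(sorted_items, target):
    """Binary-search a sorted list for target using bisect_left."""
    # bisect_left returns the leftmost index where target could be inserted
    # while keeping the list sorted.
    i = bisect_left(sorted_items, target)
    return i < len(sorted_items) and sorted_items[i] == target


primes = [2, 3, 5, 7, 11, 13]
print(contains(primes, 7))     # True
print(contains(primes, 8))     # False
print(bisect_left(primes, 8))  # 4, the index where 8 would be inserted
```

The lookup is O(log n) rather than the O(n) of `target in some_list`, but only if the list stays sorted.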


pyOpenSci, 4 hours ago to random

pip install xgi

And get started streamlining your processes for working with higher-order networks from start to finish! XGI is part of the #pyOpenSci ecosystem, and excels at many things, including:

🔍 Analyzing higher-order networks with measures and algorithms
🧰 Manipulating node and edge statistics in a flexible and customizable way
🎨 Drawing higher-order networks in a variety of visually striking ways

📄 XGI docs: https://xgi.readthedocs.io/en/stable/

#opensource #openscience #python #pythonpackage


talkpython, 4 hours ago to python

Kicking off another @talkpython live stream in a few minutes! Join us and be part of the show with @mkennedy and guests. #python #podcast https://talkpython.fm/stream/live


treyhunner, 5 hours ago to python

What are your favorite #Python one liners?



Posit, 8 hours ago to python

RStudio IDE and Posit Workbench version 2024.04.0, code-named “Chocolate Cosmos,” is now out! 🎉

The latest release comes with several updates, such as bundling Quarto version 1.4, VS Code updates, and support for R 4.4.

Learn more: https://posit.co/blog/rstudio-2024-04-0-whats-new/

#RStats #Python #Quarto #QuartoPub


chiefgyk3d, 10 hours ago to python

I was up late trying to figure out a stupid issue I was having with the Crowdstrike API, so I didn't stream on Twitch last night; hoping to do a stream tonight. I think they removed a feature my team was actually using, which let me contain a device and add a note that could be viewed in the dashboard.

#Coding #Crowdstrike #Python #Dev #InfoSec #Cybersecurity


joe, 12 hours ago (edited 10 hours ago) to ai

Back in January, we started looking at AI and how to run a large language model (LLM) locally (instead of just using something like ChatGPT or Gemini). A tool like Ollama is great for building a system that uses AI without dependence on OpenAI. Today, we will look at creating a Retrieval-augmented generation (RAG) application, using Python, LangChain, Chroma DB, and Ollama. Retrieval-augmented generation is the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response. If you have a source of truth that isn’t in the training data, it is a good way to get the model to know about it. Let’s get started!

Your RAG will need a model (like llama3 or mistral), an embedding model (like mxbai-embed-large), and a vector database. The vector database contains relevant documentation to help the model answer specific questions better. For this demo, our vector database is going to be Chroma DB. You will need to “chunk” the text you are feeding into the database. Let’s start there.

Chunking

There are many ways of choosing the right chunk size and overlap, but for this demo, I am just going to use a chunk size of 7500 characters and an overlap of 100 characters, with LangChain’s CharacterTextSplitter doing the chunking. The overlap means that the last 100 characters of one chunk will be duplicated at the start of the next database record.
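
To make the overlap idea concrete, here is a rough plain-Python sketch. It is a simplification, not what CharacterTextSplitter does internally (the real splitter also splits on a separator), and the function name `chunk_text` is made up for illustration. The example call uses tiny numbers so the overlap is easy to see.

```python
def chunk_text(text, chunk_size=7500, overlap=100):
    """Split text into fixed-size character windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, overlap=3)
print(chunks)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk starts with the last three characters of the one before it, which is the same thing the 100-character overlap does at full scale.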

The Vector Database

A vector database is a type of database designed to store, manage, and manipulate vector embeddings. Vector embeddings are representations of data (such as text, images, or sounds) in a high-dimensional space, where each data item is represented as a dense vector of real numbers. When you query a vector database, your query is transformed into a vector of real numbers. The database then uses this vector to perform similarity searches.

https://i0.wp.com/jws.news/wp-content/uploads/2024/05/Screenshot-2024-05-08-at-2.36.49%E2%80%AFPM.png?resize=665%2C560&ssl=1

You can think of it as being like a two-dimensional chart with points on it. One of those points is your query. The rest are your database records. What are the points that are closest to the query point?
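
Sticking with the two-dimensional picture, a similarity search can be sketched in a few lines. The vectors and record names below are invented for illustration, and cosine similarity is just one common way to measure closeness; a real vector database may use a different distance metric and much higher-dimensional vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, records, k=2):
    """Return the names of the k records most similar to the query vector."""
    ranked = sorted(
        records,
        key=lambda name: cosine_similarity(query, records[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical 2-D "embeddings"; real ones have hundreds of dimensions.
records = {
    "ipad announcement": [0.9, 0.1],
    "apple pencil": [0.7, 0.3],
    "board game rules": [0.1, 0.9],
}
query = [0.8, 0.2]
print(nearest(query, records))  # ['ipad announcement', 'apple pencil']
```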

Embedding Model

To do this, you can’t just use an Ollama model. You also need an embedding model. As of this writing, there are three available to pull from the Ollama library. For this demo, we are going to be using nomic-embed-text.

Main Model

Our main model for this demo is going to be phi3. It is a 3.8B-parameter model that was trained by Microsoft.

LangChain

You will notice that today’s demo is heavily using LangChain. LangChain is an open-source framework designed for developing applications that use LLMs. It provides tools and structures that enhance the customization, accuracy, and relevance of the outputs produced by these models. Developers can leverage LangChain to create new prompt chains or modify existing ones. LangChain pretty much has APIs for everything that we need to do in this app.

The Actual App

Before we start, you are going to want to pip install tiktoken langchain langchain-community langchain-core. You are also going to want to ollama pull phi3 and ollama pull nomic-embed-text. This is going to be a CLI app. You can run it from the terminal like python3 app.py "<Question Here>".

You also need a sources.txt file containing the URLs of things that you want to have in your vector database.

So, what is happening here? Our app.py file is reading sources.txt to get a list of URLs for news stories from Tuesday’s Apple event. It then uses WebBaseLoader to download the pages behind those URLs, uses CharacterTextSplitter to chunk the data, and creates the vectorstore using Chroma. It then creates and invokes rag_chain.

Here is what the output looks like:

https://i0.wp.com/jws.news/wp-content/uploads/2024/05/Screenshot-2024-05-08-at-4.09.36%E2%80%AFPM.png?resize=1024%2C845&ssl=1

The May 7th event is too recent to be in the model’s training data. The RAG setup makes sure that the model knows about it anyway. You could also feed the model company policy documents, the rules to a board game, or your diary, and it will magically know that information. Since you are running the model locally in Ollama, that information also stays on your machine. It is pretty awesome.

Have any questions, comments, etc? Feel free to drop a comment, below.

https://jws.news/2024/how-to-build-a-rag-system-using-python-ollama-langchain-and-chroma-db/

#AI #ChromaDB #Chunking #LangChain #LLM #Ollama #Python #RAG


ThePSF, 14 hours ago to python

Will you be at @pycon US this year? Join the fun and sign up to volunteer at the PSF Booth (or another location) for a couple of hours! Volunteering at #PyConUS is a great way to meet awesome folks in our community 🫶 #python
https://us.pycon.org/2024/volunteers/volunteering/


fohrloop, 14 hours ago to python

The new Python 3.13 REPL looks so useful that I might be able to switch from IPython to it entirely!

#python #python313 #pythonnews

https://treyhunner.com/2024/05/my-favorite-python-3-dot-13-feature/


skribe, 15 hours ago to django

Django peeps. I want to link my languages table (English, French, Chinese, etc) to the word classes (Nouns, Verbs, Adjectives, etc) table. It would be a many-to-many relationship, but I'm not sure whether to use a join table or the many-to-many model. What's the most Django way?

#Django #Python #SQL #Question


marsbarlee, 19 hours ago to python

Woah! I’m giving a talk at #PyConUS titled “Paint by Numbers: A Retrospective on the ‘NumPy Comics’ and Under-Represented Skillsets in Documentation”.

A refreshingly honest tell-all on what went right, what went wrong and what went horribly wrong. 🥲 Check it out at the Documentation Summit, Sunday, May 19!

If you’ve never heard of the NumPy comics, check them out here! https://heyzine.com/flip-book/3e66a13901.html

#pycon #python #numpy #opensource


jackwilliambell, 22 hours ago to python

> "#Python 3.13 just hit feature freeze with the first beta release today. Just before the feature freeze, a shiny new feature was added: a brand new Python REPL."

> My favorite Python 3.13 feature. https://treyhunner.com/2024/05/my-favorite-python-3-dot-13-feature/

They finally fixed the 'exit' thing! Yay! 🎉 🥳 🎈

#programming


treyhunner, 1 day ago to python

You may be wondering, why is the start index included, but the stop index is excluded?

Read the full article: List slicing in Python
▸ https://trey.io/ZEuawA

#Python
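
One common argument for the half-open convention (not necessarily the one the linked article makes) is that the pieces fit together without any off-by-one arithmetic. A quick sketch with made-up data:

```python
fruits = ["apple", "banana", "cherry", "durian", "elderberry"]

# Splitting at any index n gives two pieces that rebuild the original list.
n = 2
head, tail = fruits[:n], fruits[n:]
print(head + tail == fruits)  # True

# The length of a slice is simply stop - start.
start, stop = 1, 4
print(len(fruits[start:stop]) == stop - start)  # True
```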


AFPy, 1 day ago to python French

🐍 The countdown has begun! The Call for Proposals for PyConFR is open until July 21, 23:59 (Europe/Paris). #PyConFR #Python

https://cfp.pycon.fr/pyconfr-2024/cfp


JordiGH, 1 day ago to python

This is the first time I've worked for a company that I actively want to personally advertise for, but I really like what #Grist does. It's like a #spreadsheet that's really a #database and lets you use #Python as a computational language. And it's all open source!

Have some links!

https://getgrist.com

https://github.com/gristlabs/grist-core

And have some marketing materials!

The drafts were really good too:

https://docs.google.com/presentation/d/1kzmv4o2ZqRYeWPq_Y9LkS50whzcMrL_gn1zoV6EnJQA/


kellogh, 1 day ago to python

what's the word when, in #python, the declared types are wrong, so you have to butcher the code with assert statements? dark types?


pytexas, 1 day ago to python

Did you miss the PyTexas conference a few weeks ago? If so, no worries! The videos are now live! https://www.youtube.com/playlist?list=PL0MRiRrXAvRjMAfx42eiokiAmfclUX-6S

#PyTexas
#Python
#pythonprogramming


mahryekuh, 1 day ago to python

Speaking of #PyGrunn, will anyone here be present at this year's edition?

https://pygrunn.org/

#Python #conference #Groningen #Netherlands


mblayman, 1 day ago to python

Want to build CLIs in #Python and find argparse hard to use?

Mike Atkinson showed us at Python Frederick how to use the excellent Click package to make CLI tools. https://www.youtube.com/watch?v=uXS9hmp4lp4


bbelderbos, 1 day ago to python

Custom exceptions boost code clarity and intent in #Python, aligning with the Zen's "explicit is better than implicit."

Check out this clean example from the pyjokes package:
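
For readers who want the general pattern in text form, here is a small sketch of it. Every name in it (`JokeError`, `UnknownLanguageError`, `get_joke`, the `JOKES` data) is hypothetical and written for illustration; it is not the pyjokes source.

```python
# Hypothetical data and names, for illustration only.
JOKES = {"en": ["Why do programmers prefer dark mode? Because light attracts bugs."]}


class JokeError(Exception):
    """Base class so callers can catch every error from this module at once."""


class UnknownLanguageError(JokeError):
    """Raised when no jokes exist for the requested language code."""

    def __init__(self, language):
        self.language = language
        super().__init__(f"no jokes available for language {language!r}")


def get_joke(language="en"):
    """Return the first joke for a language, or raise a descriptive error."""
    try:
        return JOKES[language][0]
    except KeyError:
        # Replace the bare KeyError with an exception that states the intent.
        raise UnknownLanguageError(language) from None


try:
    get_joke("xx")
except UnknownLanguageError as exc:
    print(exc)  # no jokes available for language 'xx'
```

The caller sees an error named after what went wrong instead of a bare KeyError, and can still catch the shared base class if it does not care about the details.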


hugovk, 1 day ago to python

Next month, @the_compiler is organising a pytest sprint in Austria, next to the Swiss border.

There's also a possibility for paid travel/accommodation.

See https://github.com/pytest-dev/sprint for more info and signup.
#Python #pytest #sprint


scy, 1 day ago to python

Imports in Python can be confusing.

I just saw someone ask "why do I have to prepend a dot to import a file in the same directory?"

That depends on whether the file with the import statement is in a package or not.

But, whether Python considers it to be in a package depends on how you imported (or ran) that file. You can't determine it from the file's content or the filesystem structure!

Check alt text (image description) for explanations of the examples.

#Python #import

Running four commands from the parent directory.
"python3 mypkg/abs_import.py" prints "in pkg", because Python found mymod.py in the same directory as the file it has been asked to run. That's because sys.path (which is used to define search order for modules) is initialized like this (quoting from the docs): "The first entry in the module search path is the directory that contains the input script, if there is one. Otherwise, the first entry is the current directory, which is the case when executing the interactive shell, a -c command, or -m module."
"python3 mypkg/rel_import.py" throws an ImportError: "attempted relative import with no known parent package". Just because you're running a file in a subdirectory doesn't make this directory a package.
"python3 -m mypkg.rel_import" prints "in pkg", because Python is now interpreting mypkg as a package name, has found the rel_import module in that package, and is able to do relative imports from there.
"python3 -m mypkg.abs_import" prints "top level". Remember the documentation from above: If there is no input script (and there isn't, we're asking Python to resolve and run a module instead), the first sys.path entry is the current directory, i.e. the one containing the mypkg package, because that's the one we're currently in.

We now change into the mypkg directory with a "cd" command.
"python3 abs_import.py" prints "in pkg", because Python is going to look for "mymod" in the directory containing the input script (which happens to be the current directory, but that's not relevant).
"python3 rel_import.py" throws an ImportError "attempted relative import with no known parent package" again. Understandably, because Python has no way of knowing that the directory we're currently in can be interpreted as a package.
"python3 -m abs_import" prints "in pkg", because sys.path first looks for mymod in the current directory.
"python3 -m rel_import" raises an ImportError "attempted relative import with no known parent package" again. That's because, in contrast to what we did in the last example in the previous screenshot, we're now just using "rel_import" as the module name we're asking Python to run, without the "mypkg" prefix. Adding the prefix wouldn't work, because our current directory is already inside the mypkg package and Python (correctly) wouldn't find another "mypkg" directory in it. But without the prefix, Python doesn't know that the "rel_import" module resides in a package at all.

We now move into the parent directory again ("cd ..") and delete the top-level mymod.py file. Then, we attempt the examples from the second image again.
"python3 mypkg/abs_import.py" prints "in pkg" as before, because Python found mymod.py in the same directory as the file it has been asked to run.
"python3 mypkg/rel_import.py" throws an ImportError: "attempted relative import with no known parent package", just like before, because it interprets the file path as a script to run, not as a module in a package.
"python3 -m mypkg.rel_import" prints "in pkg", just like before, because Python is interpreting mypkg as a package name, has found the rel_import module in that package, and is able to do relative imports from there.
"python3 -m mypkg.abs_import" throws a ModuleNotFoundError "No module named 'mymod'". Before, it printed "top level", but now we have deleted the top-level mymod.py file that it was importing.
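
The first set of commands can be reproduced with a short script. The screenshots are not part of this text, so the file contents below are an assumption reconstructed from the description: both mymod.py files just print a marker, abs_import.py does `import mymod`, and rel_import.py does `from . import mymod`. The helper names are made up.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_layout(root):
    """Create the assumed layout: a top-level mymod.py plus a mypkg directory."""
    pkg = root / "mypkg"
    pkg.mkdir()
    (root / "mymod.py").write_text('print("top level")\n')
    (pkg / "mymod.py").write_text('print("in pkg")\n')
    (pkg / "abs_import.py").write_text("import mymod\n")
    (pkg / "rel_import.py").write_text("from . import mymod\n")


def run(args, cwd):
    """Run this interpreter with args; return stdout, or the last stderr line on failure."""
    proc = subprocess.run(
        [sys.executable, *args], cwd=cwd, capture_output=True, text=True
    )
    if proc.returncode == 0:
        return proc.stdout.strip()
    return proc.stderr.strip().splitlines()[-1]


def demo():
    """Run the four commands from the parent directory and collect the results."""
    commands = [
        ["mypkg/abs_import.py"],
        ["mypkg/rel_import.py"],
        ["-m", "mypkg.rel_import"],
        ["-m", "mypkg.abs_import"],
    ]
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_layout(root)
        return {" ".join(cmd): run(cmd, root) for cmd in commands}


if __name__ == "__main__":
    for command, result in demo().items():
        print(f"python3 {command}: {result}")
```

Under those assumptions (and with no PYTHONSAFEPATH or -P in play), the results should line up with the description: "in pkg", the relative-import ImportError, "in pkg", and "top level".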


henryiii, 1 day ago to python

Google fired their Python team, including one of our pybind11 lead developers (the list of accomplishments of that team is, ah, was, impressive!) We'll need to tighten up our min version support for pybind11, so I've opened up a poll: https://github.com/pybind/pybind11/discussions/5124 3.7+ or 3.8+? #python


pkw, 1 day ago to python

#python fixture config is magic and I don't like it.

def test_something(fixture):
    ...

That's how it works in pytest. What this does is take the name of the parameter, fixture, and check whether it matches the name of a previously defined fixture function. If you don't know that, it looks bizarre. That IS NOT a normal argument passed into a function but a sentinel that is used to look up a fixture by its parameter name.

WHY not just pass in the ACTUAL FIXTURE ?!?!

def test_something(fixtures=[fixture1, fixture2]):
    ...
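
For anyone puzzled by the mechanism, lookup-by-parameter-name can be imitated with the standard library. This is a toy illustration, not how pytest is actually implemented, and `FIXTURES`, `fixture`, `run_test`, and `demo` are invented names.

```python
import inspect

FIXTURES = {}  # maps fixture name -> function that builds the value


def fixture(func):
    """Register func under its own name, loosely like a fixture decorator."""
    FIXTURES[func.__name__] = func
    return func


def run_test(test_func):
    """Call test_func, building each argument from the fixture with the same name."""
    names = inspect.signature(test_func).parameters
    kwargs = {name: FIXTURES[name]() for name in names}
    return test_func(**kwargs)


def demo():
    @fixture
    def database():
        return {"users": ["ada"]}

    # The parameter name "database" is what links this test to the fixture.
    def test_something(database):
        assert database["users"] == ["ada"]
        return "passed"

    return run_test(test_something)


print(demo())  # passed
```

The runner only ever sees the parameter name, so renaming the argument breaks the link, which is exactly the "magic" being complained about.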
