cross-posted from: https://lemmy.world/post/76020

Greetings Reddit Refugees!

I hope your migration is going well! If you haven’t been here before, Welcome to FOSAI! Your new Lemmy landing page for all things artificial intelligence.

This is a follow-up post to my first Welcome Message.

Here I will share insights and instructions on how to set up some of the tools and applications in the aforementioned AI Suite.

Please note that I did not develop any of these, but I do have each one of them working on my local PC, which I interface with regularly. I plan to write posts exploring each tool in detail - but first - let’s get a better idea of what we’re working with.

As always, please don’t hesitate to comment or share your thoughts if I missed something (or you want to understand or see a concept in more detail).

Getting Started with FOSAI

What is oobabooga?

How-To-Install-oobabooga

In short, oobabooga is a free and open source web client someone (oobabooga) made to interface with HuggingFace LLMs (large language models). As far as I understand, this is the current standard for many AI tinkerers and those who wish to run models locally. This client allows you to easily download, configure, and chat with text-based models that behave like Chat-GPT. However, not all models on HuggingFace are at the same level as Chat-GPT out-of-the-box. Many require ‘fine-tuning’ or ‘training’ to produce consistent, coherent results. The benefit of using HuggingFace (instead of Chat-GPT) is that you have many more options to choose from regarding your AI model, including the option to choose a censored or uncensored version of a model, untrained or pre-trained, etc. Oobabooga is an interface that lets you do all this (theoretically), but it can have a bit of a learning curve if you don’t know anything about AI/LLMs.

What is gpt4all?

How-To-Install-gpt4all

gpt4all is the closest thing you can currently download to a Chat-GPT style interface, and it is compatible with some of the latest open-source LLMs available to the community. Some models can be downloaded in quantized formats, unquantized formats, and base formats (which typically run GPU only), but new model formats are emerging (GGML) that enable GPU + CPU compute. This GGML format seems to be the growing standard for consumer-grade hardware. Some prefer the user experience of gpt4all over oobabooga, and some feel the exact opposite. For me - I prefer the options oobabooga provides - so I use that as my ‘daily driver’ while gpt4all is a backup client I run for other tests.
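To get a feel for why quantized formats matter on consumer hardware, here is a rough back-of-the-envelope sketch. It only estimates the memory needed to hold the weights themselves, assuming a flat number of bits per weight. The function name is my own, and real quantized files (GGML included) carry some extra overhead on top of this, so treat the numbers as a lower bound rather than an exact figure.

```python
def estimate_weight_memory_gb(num_parameters: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in gigabytes (10^9 bytes).

    Assumption: every weight uses the same number of bits. Real formats add
    per-block metadata, and running the model needs extra memory for context.
    """
    total_bytes = num_parameters * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # A hypothetical 7-billion-parameter model at three precisions.
    for label, bits in [("16-bit (unquantized)", 16), ("8-bit", 8), ("4-bit", 4)]:
        size = estimate_weight_memory_gb(7e9, bits)
        print(f"7B model, {label}: ~{size:.1f} GB")
```

The same arithmetic works for any model size, so you can quickly check whether a given download has a chance of fitting in your RAM or VRAM before you grab it.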

What is Koboldcpp?

How-To-Install-Koboldcpp

Koboldcpp, like oobabooga and gpt4all, is another web-based interface you can run to chat with LLMs locally. It enables GGML inference, which can be hard to get running on oobabooga depending on the version of your client and updates from the developer. Koboldcpp, however, is part of a totally different platform and team of developers who typically focus on the roleplaying aspect of generative AI and LLMs. Koboldcpp feels more like NovelAI than anything I’ve run locally, and has similar functionality and vibes to AI Dungeon. In fact, you can download some of the same models and settings that they use to emulate something very similar (but 100% local, assuming you have capable hardware).

What is TavernAI?

How-To-Install-TavernAI

TavernAI is a customized web-client that seems as functional as gpt4all in most regards. You can use TavernAI to connect with Kobold’s API - as well as insert your own Chat-GPT API key to talk with OpenAI’s GPT-3 (and GPT-4 if you have API access).
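For the curious, this is roughly what "connecting to Kobold's API" looks like from a script's point of view. It is a minimal sketch, not TavernAI's actual code. The base URL, port, endpoint path, and field names below are assumptions based on the KoboldAI-style text generation API, so check your own client's API docs before relying on them.

```python
import json
import urllib.request

# Assumption: a KoboldAI-style server is running locally on this address.
DEFAULT_BASE_URL = "http://localhost:5001"


def build_payload(prompt: str, max_length: int = 80) -> dict:
    """Build the JSON body for a text generation request."""
    return {"prompt": prompt, "max_length": max_length}


def extract_text(response: dict) -> str:
    """Pull the generated text out of a reply shaped like {"results": [{"text": ...}]}."""
    results = response.get("results", [])
    return results[0].get("text", "") if results else ""


def generate(prompt: str, base_url: str = DEFAULT_BASE_URL, max_length: int = 80) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = urllib.request.Request(
        f"{base_url}/api/v1/generate",  # assumed endpoint path
        data=json.dumps(build_payload(prompt, max_length)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as reply:
        return extract_text(json.load(reply))
```

With a local server running, something like `print(generate("Once upon a time"))` would send the prompt and print whatever the model writes back. Front-ends like TavernAI do essentially this for you, plus character cards and chat history on top.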

What is Stable Diffusion?

How-To-Install-StableDiffusion (Automatic1111)

Stable Diffusion is a groundbreaking and popular AI model that enables text to image generation. When people think of “Stable Diffusion”, they tend to picture Automatic1111’s UI/UX, which is the same interface oobabooga is inspired by. This UI/UX has become the de facto standard for almost all Stable Diffusion workflows. Fun factoid - it is widely believed that MidJourney is a highly tuned version of a Stable Diffusion model, but one whose weights, LoRAs, and configurations were made closed-source after training and alignment.

What is ControlNet?

How-To-Install-ControlNet

ControlNet is a way you can manually control models of Stable Diffusion, allowing you to have complete freedom over your generative AI workflow. The best example of what this is (and what it can do) can be seen in this video. Notice how it combines an array of tools you can use as pre-processors for your prompts, enhancing the composition of your image by giving you options to bring out any detail you wish to manifest.

What is TemporalKit?

How-To-Install-TemporalKit

This is another Stable Diffusion extension that allows you to create custom videos using generative AI. In short, it takes an input video and chops it into dozens (or hundreds) of frames that can then be batch edited with Stable Diffusion, amassing new key frames and sequences which are stitched back together with EbSynth using your new images, resulting in a stylized video that was generated and edited based on your Stable Diffusion prompt/workflow.
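The "chop into frames, edit some, stitch back" idea is easier to picture with a tiny example. The sketch below only shows one simple way to pick which frames become key frames: take every Nth frame and always include the last one. This is my own illustration (the function name and the fixed-interval rule are assumptions), not how TemporalKit or EbSynth actually choose frames.

```python
from typing import List


def select_keyframes(total_frames: int, interval: int) -> List[int]:
    """Return the indices of frames to stylize, given a fixed spacing.

    Assumption: key frames are evenly spaced. The final frame is always
    included so the end of the clip has a stylized reference too.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    if total_frames <= 0:
        return []
    keyframes = list(range(0, total_frames, interval))
    if keyframes[-1] != total_frames - 1:
        keyframes.append(total_frames - 1)
    return keyframes


if __name__ == "__main__":
    # A 10-frame clip with a key frame every 4 frames.
    print(select_keyframes(10, 4))
```

A smaller interval means more frames go through Stable Diffusion (slower, but less drift between key frames), while a larger interval leans harder on the stitching step to fill in the gaps.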

Where to Start?

Unsure where to begin? Do you have no idea what you’re doing? Or have paralysis by analysis? That’s okay, we’ve all been there.

Start small, don’t install everything at once, and instead, ask yourself what sounds like the most fun? Pick one of the tools I’ve mentioned above and spend as much time as you need to get it working. This work takes patience, cultivation, and motion. The first two parts of that (patience, cultivation) typically take the longest to get over.

If you end up at your wit’s end installing or troubleshooting these tools - remind yourself this is bleeding edge artificial intelligence technology. It shouldn’t be easy in these early phases. The good news is I have a strong feeling it will become easier than any of us could imagine over time. If you cannot get something working, consider posting your issue here with information regarding your problem.

To My Esteemed Lurkers…

If you’re a lurker (like I used to be), please consider taking a popcorn break and stepping out of your shell, making a post, and asking questions. This is a safe space to share your results and interests with AI - or make a post about your epic project or goal. All progress is welcome here, all conversations about this tech are fair and waiting to be discussed.

Over the course of this next week I will continue releasing general information to catch this community up to some of its more-established counterparts.

Consider subscribing if you liked the content of this post or want to stay in the loop with Free, Open-Source Artificial Intelligence.

  • Blaed@lemmy.worldOP
    1 year ago

    FWIW, it’s a new term I am trying to coin in FOSS communities (Free, Open-Source Software communities). It’s a spin-off of ‘FOSS’, but for AI.

    There’s literally nothing wrong with FOSS as an acronym, I just wanted to use one more focused on AI tech to set the right expectations for everything shared in /c/FOSAI.

    I felt it was a term worth coining given the varied requirements and dependencies AI/LLMs tend to have compared to typical FOSS stacks. Making this differentiation is important in some of the semantics these conversations carry.