Same, but I’m a massive switch, so I’d love to help you with that <3
God I wish that was me
Gorgeous! Ever been fucked with a big girl cock? <3
Thanks :)
Is there a way for instance owners to set certain communities as NSFW by default? That would probably be helpful for people using apps that don’t have the NSFW tag option. Though in the meantime, I would say if you can’t tag through your app, you should post through the mobile site, it’s perfectly usable, and it’s the only way I interact with lemmy.
Fuckkk, I would let you do anything you wanted to me
You should be doing me instead 😤
The rules don’t specify whether or not transfem users can post oc here (obviously we’d be included under women, but in the interest of clarity, I figured it’s best to ask)
Love your tattoo!
This model doesn’t have the new llama.cpp k-quantization, and there are better models that have come out since Pygmalion 13B, like guanaco (https://huggingface.co/TheBloke/guanaco-13B-GGML) and manticore (https://huggingface.co/TheBloke/manticore-13b-chat-pyg-GGML). For the optimal ratio of performance to speed, I’d go with either q_3_K_L quantization (for lower-spec hardware) or any of the q_n_K_M quantization schemes, since those use higher-bit quantization on the attention layers and perform better than either the q_n_K_S models or the plain q_n models.
GGML also supports GPU acceleration via CUDA now, so if you have an Nvidia GPU, add the flag “--n-gpu-layers N”, where N is the number of layers sent to the GPU. You want this as high as possible without crashing.
You should also add the flag “--threads N”, where N is the number of CPU threads you want to use, otherwise it’ll run single-threaded and the performance will be terrible. N should be the number of physical CPU cores you have, maybe 1-2 higher; I’d play around with it a bit. Be careful though: if you’re able to fit all/most of the model in the GPU, --threads should be 1 (for 100% in GPU) or close to it (80%+ in GPU).
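Putting those two flags together, here’s a rough sketch of what an invocation could look like (the model filename, layer count, and thread count are placeholders — tune them for your own hardware and whichever GGML file you downloaded):

```shell
# Hypothetical example: offload 35 layers to the GPU, use 6 CPU threads
# for whatever doesn't fit. Model path is a placeholder.
./main -m ./models/manticore-13b-chat-pyg.ggmlv3.q4_K_M.bin \
  --n-gpu-layers 35 \
  --threads 6 \
  -p "Your prompt here"
```

If generation crashes with an out-of-memory error, lower --n-gpu-layers until it fits; if the whole model fits on the GPU, drop --threads down to 1.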
These numbers aren’t exact and will depend on your hardware config, so tinker a bit. Also, if you’re getting speeds faster than you need for the 13B parameter models, I’d try out a 33B parameter model (use at least a q_3 one though; q_2_K 33B quality is a bit worse than some of the higher-bit quantized 13B models imo, and the performance isn’t that much better than a q_3 33B parameter model, so it’s just not worth it). The quality of the output is much better, if you have good enough hardware or can tolerate the speed penalty.
Edit: Also manticore was trained on a cleaned version of the Pygmalion dataset, plus a bunch of other stuff, so it works like a drop-in replacement for Pygmalion if you have some application that depends on a model using Pygmalion prompt formats.
You too <3, that cock would absolutely destroy me
Honestly Jerboa was kind of terrible last time I tried to use it; I’ve just been saving the webpage to my homescreen and using it as a web app. It’s faster and better than Jerboa somehow. No hate to the devs, app development is difficult, it’s just not in a usable state for me atm
Tysm! <3