SDXL. There is a change to the webui's Python code that removes the need to add "--precision full --no-half" for NVIDIA GTX 16xx cards. Just check your VRAM and be sure optimizations like xformers are set up correctly, because other UIs like ComfyUI already enable those, so you don't really feel the higher VRAM usage of SDXL. Flags people mention for this: --opt-sdp-no-mem-attention --upcast-sampling --no-hashing --always-batch-cond-uncond --medvram.

You'd need to train a new SDXL model with far fewer parameters from scratch, but with the same shape.

I have always wanted to try SDXL, so when it was released I loaded it up and, surprise, 4-6 minutes per image at about 11 s/it. I tried ComfyUI and it takes about 30 s to generate 768x1048 images (I have an RTX 2060 with 6 GB VRAM).

--medvram is not a speed-up on its own. You may experience it as "faster" because the alternative may be out-of-memory errors or running out of VRAM and switching to CPU (extremely slow), but it works by slowing things down so lower-memory systems can still process without resorting to the CPU.

For a few days life was good in my AI art world. I have tried rolling back the video card drivers to multiple different versions. Many of the new models are related to SDXL, with several models for Stable Diffusion 1.5 as well. There are benefits to running SDXL in ComfyUI.

--medvram makes the Stable Diffusion model consume less VRAM by splitting it into three parts - cond (for transforming text into a numerical representation), first_stage (for converting a picture into latent space and back), and unet (for the actual denoising of latent space) - and making it so that only one is in VRAM at all times, sending the others to CPU RAM.

Introducing our latest YouTube video, where we unveil the official SDXL support for Automatic1111. I found that on the old version a full system reboot sometimes helped stabilize generation. Took 33 minutes to complete.

You using --medvram? I have very similar specs btw, exact same GPU; usually I don't use --medvram for normal SD 1.5. Things seem easier for me with Automatic1111. Not OP, but using medvram makes Stable Diffusion really unstable in my experience, causing pretty frequent crashes.

My usual settings on 1.6 with the --medvram-sdxl flag: image size 832x1216, upscale by 2, DPM++ 2M or DPM++ 2M SDE Heun Exponential (these are just my usuals, but I have tried others), sampling steps 25-30, Hires. fix.

Workflow Duplication Issue Resolved: The team has resolved an issue where workflow items were being run twice for PRs from the repo.

5 minutes with Draw Things. I'm using a 2070 Super with 8 GB VRAM. My computer black-screens until I hard reset it. --opt-sdp-attention: enables the scaled dot-product cross-attention layers.

When the model runs in fp16 (.half()), the resulting latents can't be decoded into RGB using the bundled VAE anymore without producing all-black NaN tensors. For 20 steps at 1024x1024 in Automatic1111, SDXL with a ControlNet depth map takes around 45 seconds to generate a picture on my 3060 12 GB VRAM, Intel 12-core, 32 GB RAM, Ubuntu 22. That's why I love it.

A1111 took forever to generate an image without the refiner, the UI was very laggy, and removing all the extensions didn't really change anything, so the image always got stuck at 98%; I don't know why. There is also a dedicated .py script for SDXL fine-tuning.

The suggested --medvram: I removed it when I upgraded from an RTX 2060 6 GB to an RTX 4080 12 GB (both laptop/mobile).
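For anyone new to these flags: on Windows they go on the COMMANDLINE_ARGS line of webui-user.bat. A minimal illustrative sketch follows (the exact flags are examples only, not a recommendation for every card):

    @echo off
    set PYTHON=
    set GIT=
    set VENV_DIR=
    set COMMANDLINE_ARGS=--xformers --medvram
    call webui.bat

Once you move to a card with plenty of VRAM, drop --medvram again, since it costs speed when it isn't needed.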
EDIT: Looks like we do need to use --xformers. I tried without, but this line wouldn't pass, meaning xformers wasn't properly loaded and it errored out; to be safe I use both arguments now, although --xformers should be enough.

In webui-user.bat, set COMMANDLINE_ARGS= --precision full --no-half --medvram --opt-split-attention (this means you start SD from webui-user.bat).

Let's dive into the details! Major Highlights: one of the standout additions in this update is the experimental support for Diffusers. You can make it at a smaller resolution and upscale in Extras, though.

This article covers how to use SDXL with AUTOMATIC1111 and shares impressions from trying it.

I have the same GPU, and trying a picture size beyond 512x512 gives me a runtime error, "There is not enough GPU video memory". On my 3080 I have found that --medvram takes the SDXL times down to 4 minutes from 8 minutes. I'm on Ubuntu and not Windows.

--medvram: None: False: Enable Stable Diffusion model optimizations, sacrificing some performance for low VRAM usage.

For one 512x512 image it takes me 1.5 GB. Wow, thanks; it works! From the HowToGeek "How to Fix CUDA Out of Memory" section: command args go in webui-user.bat. But if you have an NVIDIA card, you should be running xformers instead of those two.

The refiner goes in the same folder as the base model, although with the refiner I can't go higher than 1024x1024 in img2img. Even though Tiled VAE works with SDXL, it still has a problem that SD 1.5 does not.

But these arguments did not work for me; --xformers gave me a minor bump in performance (8 s/it). I posted a guide this morning: SDXL on a 7900 XTX and Windows 11. PS: medvram was giving me errors and just wouldn't go higher than 1280x1280, so I don't use it.

I've also got 12 GB, and with the introduction of SDXL I've gone back and forth on that. SDXL works without it; I must consider whether I should run without medvram. It initially couldn't load the weights, but then I realized my Stable Diffusion wasn't updated to v1.6.

One suggested configuration combines set COMMANDLINE_ARGS=--xformers --opt-split-attention --opt-sub-quad-attention --medvram with set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128 (a cleaned-up sketch follows below).

SDXL is definitely not 'useless', but it is almost aggressive in hiding NSFW. I've tried adding --medvram as an argument, still nothing (I have 8 GB of VRAM). medvram and lowvram have caused issues when compiling the engine and running it.

--lowram: None: False: Load Stable Diffusion checkpoint weights to VRAM instead of RAM.

Generated enough heat to cook an egg on. I can generate 1024x1024 in A1111 in under 15 seconds, and using ComfyUI it takes less than 10 seconds. SDXL and Automatic1111 hate each other.

From the changelog: add --medvram-sdxl flag that only enables --medvram for SDXL models; prompt editing timeline has separate range for first pass and hires-fix pass (seed breaking change). Minor: img2img batch: RAM savings, VRAM savings, .tif/.tiff in img2img batch (#12120, #12514, #12515); postprocessing/extras: RAM savings. From 1.6.0 the handling of the Refiner has changed.

After that, SDXL stopped giving problems; model load time is around 30 seconds. Disabling "Checkpoints to cache in RAM" lets the SDXL checkpoint load much faster and not use a ton of system RAM. I just loaded the models into the folders alongside everything else.

For example, you might be fine without --medvram for 512x768 but need the --medvram switch to use ControlNet on 768x768 outputs. SDXL base has a fixed output size of about 1 megapixel; that applies to the 1.0 model as well as the new DreamShaper XL 1.0.
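As a point of reference, those two set statements belong on separate lines of webui-user.bat rather than being run together. An illustrative sketch using the values quoted above (tune them, or omit PYTORCH_CUDA_ALLOC_CONF entirely, for your own card):

    set COMMANDLINE_ARGS=--xformers --opt-split-attention --opt-sub-quad-attention --medvram
    set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128

PYTORCH_CUDA_ALLOC_CONF is a standard PyTorch environment variable: garbage_collection_threshold controls how eagerly the CUDA caching allocator releases unused blocks, and max_split_size_mb limits how large blocks may be split, which can help with fragmentation-related out-of-memory errors.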
Since SDXL came out I think I have spent more time testing and tweaking my workflow than actually generating images. I have trained profiles with the medvram option both enabled and disabled. I think the problem of slowness may be caused by not enough RAM (not VRAM).

Finally, AUTOMATIC1111 has fixed the high-VRAM issue in the pre-release 1.6.0-RC; it's taking only 7.5 GB of VRAM and swapping the refiner too. Use the --medvram-sdxl flag when starting.

Don't give up, we have the same card and it worked for me yesterday. I forgot to mention: add the --medvram and --no-half-vae arguments; I had --xformers too prior to SDXL. There is also another argument that can help reduce CUDA memory errors; I used it when I had 8 GB VRAM. You'll find these launch arguments on the GitHub page of A1111.

The prompt was a simple "A steampunk airship landing on a snow covered airfield". I shouldn't be getting this message in the first place. The post just asked for the speed difference between having it on vs. off. Myself, I've only tried to run SDXL in Invoke.

1600x1600 might just be beyond a 3060's abilities. Specs: 3060 12 GB, tried vanilla Automatic1111. In the webui-user.bat file: set COMMANDLINE_ARGS=--precision full --no-half --medvram --always-batch-cond-uncond. Also, 1024x1024 at batch size 1 will use around 6 GB. I find the results interesting for comparison; hopefully others will too. This allows the model to run more comfortably on limited VRAM.

Speed optimization: with a 3090 or 4090 you're fine, but that's also where you'd add --medvram if you had a midrange card, or --lowvram if you wanted or needed it. I applied these changes, but it is still the same problem. ReVision is high-level concept mixing that only works on SDXL.

With the 0.9 model for Automatic1111 WebUI: my card is a GeForce GTX 1070 8 GB and I use A1111. About 3 s/it on an M1 MacBook Pro with 32 GB RAM, using InvokeAI, for SDXL 1024x1024 with the refiner. The Colab always crashes.

Hello everyone, my PC currently has a 4060 (the 8 GB one) and 16 GB of RAM. Don't turn on full precision or medvram if you want max speed. Before 1.6 I couldn't run SDXL in A1111, so I was using ComfyUI. Oof, what did you try to do? I use a 2060 with 8 GB and render SDXL images in 30 s at 1k x 1k.

MASSIVE SDXL ARTIST COMPARISON: I tried out 208 different artist names with the same subject prompt for SDXL.

The recommended way to customize how the program is run is editing webui-user.bat; an older approach was editing the commandline_args = os.environ.get("COMMANDLINE_ARGS", "") line in one of the webui's Python files directly.
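Putting the advice above together, an 8 GB card running SDXL on webui 1.6.0 or newer might use something like this in webui-user.bat (an illustrative sketch, not a universal recommendation):

    set COMMANDLINE_ARGS=--xformers --medvram-sdxl --no-half-vae

The advantage of --medvram-sdxl over plain --medvram is that SD 1.5 checkpoints keep running at full speed; the memory-splitting behaviour only kicks in when an SDXL checkpoint is loaded.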
I only see a comment in the changelog that you can use it, but I am not sure how. My 4 GB 3050 mobile takes about 3 minutes to do 1024x1024 SDXL in A1111. Google Colab/Kaggle terminates the session due to running out of RAM (#11836). I have 10 GB of VRAM and I can confirm that it's impossible without medvram. It functions well enough in ComfyUI, but I can't make anything but garbage with it in Automatic.

The problem is when I tried to do "hires fix" (not just upscale, but sampling it again, denoising and so on, using the K-Sampler) to a higher resolution like FHD.

For SD 1.5 models your 12 GB of VRAM should never need the medvram setting, since it costs some generation speed, and for very large upscaling there are several ways to upscale by use of tiles, for which 12 GB is more than enough. If you have more VRAM and want to make larger images than you can usually make (e.g. 1024x1024 instead of 512x512), use --medvram --opt-split-attention.

Before SDXL came out I was generating 512x512 images on SD 1.5. Long story short, I had to add --disable-model-loading-ram-optimization.

Step 2: Create a Hypernetworks sub-folder. This shows how you can install and use the SDXL 1.0 version in Automatic1111. But any command I enter results in images like this (SDXL 0.9).

Normally the SDXL models work fine using the medvram option, taking around 2 it/s, but when I use the TensorRT profile for SDXL it seems like the medvram option is not being used anymore, as the iterations start taking several minutes, as if the medvram option were disabled.

What a move forward for the industry. I tried ComfyUI; it was 30 seconds faster on a batch of 4, but it's a pain to build the workflows you need, and just what you need (IMO). How to install and use Stable Diffusion XL (commonly known as SDXL).

Try --medvram or --lowvram. It's amazing - I can get 1024x1024 SDXL images in ~40 seconds at 40 iterations with Euler a and base/refiner, with the medvram-sdxl flag enabled now. The "sys" readout will show the VRAM of your GPU. I have the same issue; I've got an Arc A770 too, so I guess the card is the problem. It now takes around 1 minute to generate using 20 steps and the DDIM sampler.

Also, as counterintuitive as it might seem, don't generate low-resolution images; test it with 1024x1024 at least. If you have less than 8 GB of VRAM on your GPU, it is also preferable to enable the --medvram option to save memory, so that you can generate more images at a time. Medvram sacrifices a little speed for more efficient use of VRAM.

When I tried to generate an image it failed and gave me the following lines. set COMMANDLINE_ARGS= --medvram --upcast-sampling --no-half. I have my VAE selection set in the settings.

The SDXL series is the latest generation, effectively a version 3, but it has been received fairly positively in the community as a legitimate evolution of the 2.x line, and new derivative models are already starting to appear.

set COMMANDLINE_ARGS=--xformers --medvram. The .pth decoder files (for SDXL) go in the models/vae_approx folder. Copy the .whl file to the base directory of stable-diffusion-webui.

Seems like everyone is liking my guides, so I'll keep making them :) Today's guide is about VAE (What It Is / Comparison / How to Install); as always, here's the complete CivitAI article: Civitai | SD Basics - VAE (What It Is / Comparison / How to Install).
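If you want to see what that "sys" number corresponds to, you can also watch VRAM from a separate terminal while generating; on NVIDIA cards, for example (shown purely as an illustration):

    nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1

This prints used and total VRAM once per second, which makes it easy to compare runs with and without --medvram.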
I installed SDXL in a separate directory, but that was super slow to generate an image, like 10 minutes. In your stable-diffusion-webui folder, create a sub-folder called hypernetworks.

And when it does show it, it feels like the training data has been doctored, with all the nipple-less breasts and barbie crotches.

I've been trying to find the best settings for our servers, and it seems that there are two accepted samplers that are recommended.

commandline_args = os.environ.get("COMMANDLINE_ARGS", "") - now, inside the quotation marks, copy and paste whatever arguments you need to include whenever starting the program. The generation time increases by about a factor of 10.

Yeah, 8 GB is too little for SDXL outside of ComfyUI. When generating, the GPU RAM usage goes from about 4.5 GB to 5.5 GB. I can confirm the --medvram option is what I needed on a 3070m 8 GB.

Question about ComfyUI: since it's the first time I've used it, I've preloaded a workflow from SDXL 0.9. It takes about a minute to generate a 512x512 image without hires fix using --medvram, while my newer 6 GB card takes less than 10 seconds.

Funny, I've been running 892x1156 native renders in A1111 with SDXL for the last few days. Put the base and refiner models in stable-diffusion-webui\models\Stable-diffusion. So I've played around with SDXL, and despite the good results out of the box, I just can't deal with the computation times (3060 12 GB) compared with 1.5.

Note that a command-line argument called --medvram-sdxl has also been added, which reduces VRAM consumption only when SDXL is in use. If you don't normally use medvram but want to limit VRAM consumption only for SDXL, try setting it. This applies from AUTOMATIC1111 ver 1.6.

We highly appreciate your help if you can share a screenshot in this format: GPU (like RTX 4090, RTX 3080, ...). I've seen quite a few comments about people not being able to run Stable Diffusion XL 1.0.

Before, I could only generate a few SDXL images and then it would choke completely, and generation time increased to like 20 minutes or so. set COMMANDLINE_ARGS= --medvram --upcast-sampling --no-half --precision full. Now I can just use the same install for 1.5 and SDXL with --medvram-sdxl, without having to swap. With my card I use the medvram option for SDXL. Then things updated. Yes, less than a GB of VRAM usage. If I do a batch of 4, it's between 6 and 7 minutes.

The part of AI illustration that ordinary people criticize most has been the collapsing fingers, so SDXL, which shows clear improvement there, will probably become the mainstay going forward. If you want to keep enjoying AI illustration at the cutting edge, it is worth considering installing it.

My GTX 1660 Super was giving a black screen. (Also, why should I delete my yaml files?) Unfortunately, yes. Got it updated and the weights loaded successfully. Jumped to 24 GB during final rendering.

Running git pull will pull all the latest changes and update your local installation. It runs fast. I have an RTX 3070 8 GB and A1111 SDXL works flawlessly with --medvram. With SDXL every word counts; every word modifies the result. About 4 GB used and the rest free. I hope 1.5 gets a big boost - I know there's a million of us out there.
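For reference, "update your local installation" just means running git from the folder you originally cloned the webui into; an illustrative pair of commands:

    cd stable-diffusion-webui
    git pull

If you installed through a one-click or autoinstaller package instead, use that package's own update mechanism rather than git.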
I think it fixes at least some of the issues. My hardware is an Asus ROG Zephyrus G15 GA503RM with 40 GB of DDR5-4800 RAM and two M.2 drives.

I had to set --no-half-vae to eliminate errors and --medvram to get any upscalers other than latent to work; I have not tested them all, only LDSR and R-ESRGAN 4x+. I'm on 1.6 and have done a few X/Y/Z plots with SDXL models, and everything works well. SDXL is a lot more resource intensive and demands more memory.

Pretty much the same speed I get from ComfyUI. Edit: I just made a copy of the .bat.

If you have 4 GB of VRAM and want to create 512x512 images but get an out-of-memory error with --medvram, use --medvram --opt-split-attention instead.

Two models are available. The 32G model doesn't need low/medvram, especially if you use ComfyUI; the 16G model probably will, especially if you run it. So it's like taking a cab, but sitting in the front seat or sitting in the back seat.

Open the .bat and let it run; it should take quite a while. Usually not worth the trouble for being able to do slightly higher resolution. If you have low iterations with 512x512, use --lowvram.

set PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128, then git pull.

Support for lowvram and medvram modes - both work extremely well; additional tunables are available in UI -> Settings -> Diffuser Settings. Under Windows it appears that enabling --medvram (--optimized-turbo for other webuis) will increase the speed further. That leaves about 3 GB to work with, and OOM comes swiftly after.

A little slower, and kind of like Blender with the UI. Cannot be used with --lowvram / sequential CPU offloading. With hires. fix it is about 14% slower than 1.5. Consumed 4/4 GB of graphics RAM. Using the medvram preset results in decent memory savings without a huge performance hit.

SD.Next is better in some ways - most command-line options were moved into settings to make them easier to find. I was using --medvram and --no-half on 1.5, with all extensions updated.

For the .whl, change the name of the file in the command below if the name is different. set COMMANDLINE_ARGS=--medvram --opt-sdp-attention --no-half --precision full --disable-nan-check --autolaunch --skip-torch-cuda-test and set SAFETENSORS_FAST_GPU=1.
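The .whl instruction above amounts to something like the following, run from the webui folder with its virtual environment active (the filename is a placeholder - substitute whatever wheel you actually downloaded):

    venv\Scripts\activate
    pip install downloaded-file-name.whl

Installing into the webui's own venv matters; a wheel installed into a system-wide Python will not be picked up when webui-user.bat launches.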
Before jumping on Automatic1111's fault, enable the xformers optimization and/or the medvram/lowvram launch option, and then come back and say the same thing.

--medvram: by default, the SD model is loaded entirely into VRAM, which can cause memory issues on systems with limited VRAM.

I am using AUTOMATIC1111 with an NVIDIA 3080 10 GB card, but image generations are like 1 hr+ at 1024x1024. For 8 GB of VRAM, the recommended cmd flag is "--medvram-sdxl". If it is the hi-res fix option, the second-image subject repetition is definitely caused by a too-high "Denoising strength" setting.

In the stable-diffusion-webui directory, install the .whl. So SDXL is twice as fast. I only use --xformers for the webui. 24 GB VRAM. Do you have any tips for making ComfyUI faster, such as new workflows? We might release a beta version of this feature before 3.0.

I've been having a headache with this problem for several days. With 1.5 at 30 steps it is far quicker, versus 6-20 minutes (it varies wildly) with SDXL.

Running without --medvram, I am not noticing an increase in used RAM on my system, so it could be that the way the system transfers data back and forth between system RAM and VRAM is failing to clear out the RAM as it goes.

Using --lowvram, SDXL can run with only 4 GB of VRAM - anyone? Slow progress but still acceptable, estimated 80 seconds to complete. Horrible performance.

I was running into issues switching between models (I had the setting at 8 from using SD 1.5). With --opt-sub-quad-attention --no-half --precision full --medvram --disable-nan-check --autolaunch I could do 800x600 with my 6600 XT 8 GB; not sure if your 480 could manage it. For hires. fix I tried optimizing PYTORCH_CUDA_ALLOC_CONF, but I doubt it's the optimal config.

@weajus reported that --medvram-sdxl resolves the issue; however, this is not due to the parameter itself, but due to the optimized way A1111 now manages system RAM, therefore no longer running into issue 2).

I'm sharing a few I made along the way. SDXL on Ryzen 4700U (Vega 7 iGPU) with 64 GB DRAM blue-screens [Bug] #215.

Step 3: The ComfyUI workflow. Stability AI released SDXL 1.0 on July 27, 2023. It can produce outputs very similar to the source content (Arcane) when you prompt "Arcane Style", but flawlessly outputs normal images when you leave off that prompt text - no model burning at all. The webui will inevitably support it very soon.

It takes around 18-20 seconds for me using xformers and A1111 with a 3070 8 GB and 16 GB of RAM. I am currently using the ControlNet extension on 16 GB of VRAM and it works. Yeah, I don't like the 3 seconds it takes to generate a 1024x1024 SDXL image on my 4090.

I haven't been training much for the last few months but used to train a lot, and I don't think --lowvram or --medvram can help with training. Medvram has almost certainly nothing to do with it. And nothing was good ever again. On Windows I must use it.
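On Linux the same flags go into webui-user.sh instead of the .bat file; an illustrative line (the flags shown are examples only):

    export COMMANDLINE_ARGS="--xformers --medvram-sdxl"

webui.sh reads this variable the same way webui-user.bat does on Windows, so everything said above about --medvram, --medvram-sdxl and --lowvram applies unchanged.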
Disabling live picture previews lowers RAM use and speeds up performance, particularly with --medvram. --opt-sub-quad-attention and --opt-split-attention also both increase performance and lower VRAM use, with either no or only slight performance loss, AFAIK.

One picture in about one minute. Who says you can't run SDXL 1.0? About 7 GB of VRAM, and it generates an image in 16 seconds with SDE Karras at 30 steps. To save even more VRAM, set the flag --medvram or even --lowvram (this slows everything down but allows you to render larger images).

Mine will be called "gollum". An SDXL batch of 4 held steady at 18 GB.

Adding these to the .bat file would help speed it up a bit: --medvram --opt-sdp-attention --opt-sub-quad-attention --upcast-sampling --theme dark --autolaunch. With the AMD Pro drivers, performance increased by about 50%.

You can make AMD GPUs work, but they require tinkering. A PC running Windows 11, Windows 10, or Windows 8.1. There is also a feature request for a "--no-half-vae-xl" flag.

It is attracting a lot of attention in the image-generation AI community, and it can already be used with AUTOMATIC1111. This guide covers installing ControlNet for the SDXL model.

Commandline arguments by GPU (turned into a concrete example after this list):
- NVIDIA (12 GB+): --xformers
- NVIDIA (8 GB): --medvram-sdxl --xformers
- NVIDIA (4 GB): --lowvram --xformers
- AMD (4 GB): --lowvram --opt-sub-quad-attention, plus TAESD in settings
Both ROCm and DirectML will generate at least 1024x1024 pictures at fp16.

SDXL delivers insanely good results. I have a 3090 with 24 GB of VRAM and cannot do a 2x latent upscale of an SDXL 1024x1024 image without running out of VRAM with the --opt-sdp-attention flag.
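For instance, the 8 GB NVIDIA row above would end up in webui-user.bat as a line like this (illustrative; add or remove flags to match your card and the list above):

    set COMMANDLINE_ARGS=--medvram-sdxl --xformers

The TAESD item in the AMD row is not a command-line flag; it refers to enabling the TAESD approximation in the webui's settings (under the live-preview/approximation options) so previews use the tiny decoder instead of the full VAE.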