Commit 8978347c authored by 孙傲

Initialization

Documents the process of setting up a local `Stable Diffusion` environment, as well as how to use `Stable Diffusion`.
All of the following steps are for a `Windows 64-bit` environment; `Windows 11` is recommended.
*.ps1 text eol=crlf
.vscode
.idea
venv
__pycache__
output/*
!output/.keep
py310
git
train/*
logs/*
sd-models/*
toml/autosave/*
!sd-models/put stable diffusion model here.txt
!logs/.keep
tests/
huggingface/hub/models*
[submodule "sd-scripts"]
path = sd-scripts
url = https://github.com/kohya-ss/sd-scripts.git
[submodule "frontend"]
path = frontend
url = https://github.com/hanamizuki-ai/lora-gui-dist
# LoRA-scripts
LoRA training scripts for [kohya-ss/sd-scripts](https://github.com/kohya-ss/sd-scripts.git)
✨NEW: Train GUI
![image](https://github.com/Akegarasu/lora-scripts/assets/36563862/0a2edcb8-023a-4fe6-8c92-2bad9ccab64c)
Follow the installation guide below to install the GUI, then run `run_gui.ps1` (Windows) or `run_gui.sh` (Linux) to start the GUI.
## Usage
### Clone repo with submodules
```sh
git clone --recurse-submodules https://github.com/Akegarasu/lora-scripts
```
### Required Dependencies
Python 3.10.8 and Git
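To verify both are available on `PATH` before installing (a quick check; exact patch versions may differ):

```sh
python --version  # should report Python 3.10.x
git --version
```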
### Windows
#### Installation
Running `install.ps1` will automatically create a venv and install the necessary dependencies.
#### Train
Edit `train.ps1`, and run it.
### Linux
#### Installation
Running `install.bash` will create a venv and install the necessary dependencies.
#### Train
The training script `train.sh` **will not** activate the venv for you; activate it first:
```sh
source venv/bin/activate
```
Edit `train.sh`, and run it.
#### TensorBoard
Running `tensorboard.ps1` starts TensorBoard at http://localhost:6006/
![](./assets/tensorboard-example.png)
[url "https://jihulab.com/affair3547/sd-scripts"]
insteadOf = https://github.com/kohya-ss/sd-scripts.git
[url "https://jihulab.com/affair3547/lora-gui-dist"]
insteadOf = https://github.com/hanamizuki-ai/lora-gui-dist
[url "https://jihulab.com/Akegarasu/lora-scripts"]
insteadOf = https://github.com/Akegarasu/lora-scripts
import argparse
import os
import shutil
import subprocess
import sys
import webbrowser
import uvicorn
from typing import List
# from mikazuki.app import app
parser = argparse.ArgumentParser(description="GUI for stable diffusion training")
parser.add_argument("--host", type=str, default="127.0.0.1")
parser.add_argument("--port", type=int, default=28000, help="Port to run the server on")
parser.add_argument("--tensorboard-host", type=str, default="127.0.0.1", help="Port to run the tensorboard")
parser.add_argument("--tensorboard-port", type=int, default=6006, help="Port to run the tensorboard")
parser.add_argument("--dev", action="store_true")
def find_windows_git():
possible_paths = ["git\\bin\\git.exe", "git\\cmd\\git.exe", "Git\\mingw64\\libexec\\git-core\\git.exe"]
for path in possible_paths:
if os.path.exists(path):
return path
def prepare_frontend():
if not os.path.exists("./frontend/dist"):
print("Frontend not found, try clone...")
print("Checking git installation...")
if not shutil.which("git"):
if sys.platform == "win32":
git_path = find_windows_git()
if git_path is not None:
print(f"Git not found, but found git in {git_path}, add it to PATH")
os.environ["PATH"] += os.pathsep + os.path.dirname(git_path)
return
else:
print("Git not found, please install git first")
sys.exit(1)
subprocess.run(["git", "submodule", "init"])
subprocess.run(["git", "submodule", "update"])
def remove_warnings():
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
if sys.platform == "win32":
# disable triton on windows
os.environ["XFORMERS_FORCE_DISABLE_TRITON"] = "1"
os.environ["BITSANDBYTES_NOWELCOME"] = "1"
os.environ["PYTHONWARNINGS"] = "ignore::UserWarning"
def run_tensorboard():
print("Starting tensorboard...")
subprocess.Popen([sys.executable, "-m", "tensorboard.main", "--logdir", "logs", "--host", args.tensorboard_host, "--port", str(args.tensorboard_port)])
def check_dirs(dirs: List):
for d in dirs:
if not os.path.exists(d):
os.makedirs(d)
if __name__ == "__main__":
args, _ = parser.parse_known_args()
remove_warnings()
prepare_frontend()
run_tensorboard()
check_dirs(["toml/autosave", "logs"])
print(f"Server started at http://{args.host}:{args.port}")
if not args.dev:
webbrowser.open(f"http://{args.host}:{args.port}")
uvicorn.run("mikazuki.app:app", host=args.host, port=args.port, log_level="error")
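For reference, the server can be started with custom hosts and ports using the flags defined above; the values below are illustrative, and the defaults (`127.0.0.1:28000`, TensorBoard on `6006`) apply when they are omitted:

```sh
python gui.py --host 127.0.0.1 --port 28001 --tensorboard-port 6007
```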
command_file: null
commands: null
compute_environment: LOCAL_MACHINE
deepspeed_config: {}
distributed_type: 'NO'
downcast_bf16: 'no'
dynamo_backend: 'NO'
fsdp_config: {}
gpu_ids: all
machine_rank: 0
main_process_ip: null
main_process_port: null
main_training_function: main
megatron_lm_config: {}
mixed_precision: fp16
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
tpu_name: null
tpu_zone: null
use_cpu: false
1
$Env:HF_HOME = "huggingface"
$Env:PIP_DISABLE_PIP_VERSION_CHECK = 1
$Env:PIP_NO_CACHE_DIR = 1
function InstallFail {
Write-Output "安装失败。"
Read-Host | Out-Null ;
Exit
}
function Check {
param (
$ErrorInfo
)
if (!($?)) {
Write-Output $ErrorInfo
InstallFail
}
}
if (!(Test-Path -Path "venv")) {
Write-Output "正在创建虚拟环境..."
python -m venv venv
Check "创建虚拟环境失败,请检查 python 是否安装完毕以及 python 版本是否为64位版本的python 3.10、或python的目录是否在环境变量PATH内。"
}
.\venv\Scripts\activate
Check "激活虚拟环境失败。"
Set-Location .\sd-scripts
Write-Output "安装程序所需依赖 (已进行国内加速,若在国外或无法使用加速源请换用 install.ps1 脚本)"
$install_torch = Read-Host "是否需要安装 Torch+xformers? [y/n] (默认为 y)"
if ($install_torch -eq "y" -or $install_torch -eq "Y" -or $install_torch -eq ""){
pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 -f https://mirror.sjtu.edu.cn/pytorch-wheels/torch_stable.html -i https://mirror.baidu.com/pypi/simple
Check "torch 安装失败,请删除 venv 文件夹后重新运行。"
pip install -U -I --no-deps xformers==0.0.19 -i https://mirror.baidu.com/pypi/simple
Check "xformers 安装失败。"
}
pip install --upgrade -r requirements.txt -i https://mirror.baidu.com/pypi/simple
Check "其他依赖安装失败。"
pip install --upgrade lion-pytorch dadaptation -i https://mirror.baidu.com/pypi/simple
Check "Lion、dadaptation 优化器安装失败。"
pip install --upgrade lycoris-lora -i https://mirror.baidu.com/pypi/simple
Check "lycoris 安装失败。"
pip install --upgrade fastapi uvicorn -i https://mirror.baidu.com/pypi/simple
Check "UI 所需依赖安装失败。"
pip install --upgrade wandb -i https://mirror.baidu.com/pypi/simple
Check "wandb 安装失败。"
Write-Output "安装 bitsandbytes..."
cp .\bitsandbytes_windows\*.dll ..\venv\Lib\site-packages\bitsandbytes\
cp .\bitsandbytes_windows\cextension.py ..\venv\Lib\site-packages\bitsandbytes\cextension.py
cp .\bitsandbytes_windows\main.py ..\venv\Lib\site-packages\bitsandbytes\cuda_setup\main.py
Write-Output "安装完毕"
Read-Host | Out-Null ;
#!/usr/bin/bash
script_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
create_venv=true
while [ -n "$1" ]; do
case "$1" in
--disable-venv)
create_venv=false
shift
;;
*)
shift
;;
esac
done
if $create_venv; then
echo "Creating python venv..."
python3 -m venv venv
source "$script_dir/venv/bin/activate"
echo "active venv"
fi
echo "Installing torch & xformers..."
cuda_version=$(nvcc --version | grep 'release' | sed -n -e 's/^.*release \([0-9]\+\.[0-9]\+\),.*$/\1/p')
cuda_major_version=$(echo "$cuda_version" | awk -F'.' '{print $1}')
cuda_minor_version=$(echo "$cuda_version" | awk -F'.' '{print $2}')
echo "Cuda Version:$cuda_version"
if (( cuda_major_version >= 12 )) || (( cuda_major_version == 11 && cuda_minor_version >= 8 )); then
echo "install torch 2.0.0+cu118"
pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
pip install xformers==0.0.19
elif (( cuda_major_version == 11 && cuda_minor_version == 6 )); then
echo "install torch 1.12.1+cu116"
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install --upgrade git+https://github.com/facebookresearch/xformers.git@0bad001ddd56c080524d37c84ff58d9cd030ebfd
pip install triton==2.0.0.dev20221202
else
echo "Unsupported cuda version:$cuda_version"
exit 1
fi
echo "Installing deps..."
cd "$script_dir/sd-scripts" || exit
pip install --upgrade -r requirements.txt
pip install --upgrade lion-pytorch lycoris-lora dadaptation fastapi uvicorn wandb
cd "$script_dir" || exit
echo "Install completed"
$Env:HF_HOME = "huggingface"
if (!(Test-Path -Path "venv")) {
Write-Output "Creating venv for python..."
python -m venv venv
}
.\venv\Scripts\activate
Write-Output "Installing deps..."
Set-Location .\sd-scripts
pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
pip install --upgrade -r requirements.txt
pip install --upgrade xformers==0.0.19
Write-Output "Installing bitsandbytes for windows..."
cp .\bitsandbytes_windows\*.dll ..\venv\Lib\site-packages\bitsandbytes\
cp .\bitsandbytes_windows\cextension.py ..\venv\Lib\site-packages\bitsandbytes\cextension.py
cp .\bitsandbytes_windows\main.py ..\venv\Lib\site-packages\bitsandbytes\cuda_setup\main.py
pip install --upgrade lion-pytorch dadaptation lycoris-lora fastapi uvicorn wandb
Write-Output "Install completed"
Read-Host | Out-Null ;
# LoRA interrogate script by @bdsqlsz
$v2 = 0 # load a Stable Diffusion v2.x model
$sd_model = "./sd-models/sd_model.safetensors" # base Stable Diffusion model to load, in ckpt or safetensors format
$model = "./output/LoRA.safetensors" # LoRA model to interrogate, in ckpt or safetensors format
$batch_size = 64 # batch size when processing with the Text Encoder; the underlying default is 16, 64/128 recommended
$clip_skip = 1 # use the output of the nth layer from the back of the text encoder (n>=1)
# Activate python venv
.\venv\Scripts\activate
$Env:HF_HOME = "huggingface"
$ext_args = [System.Collections.ArrayList]::new()
if ($v2) {
[void]$ext_args.Add("--v2")
}
# run interrogate
accelerate launch --num_cpu_threads_per_process=8 "./sd-scripts/networks/lora_interrogator.py" `
--sd_model=$sd_model `
--model=$model `
--batch_size=$batch_size `
--clip_skip=$clip_skip `
$ext_args
Write-Output "Interrogate finished"
Read-Host | Out-Null ;
import json
import os
import subprocess
import sys
from datetime import datetime
from threading import Lock
import starlette.responses as starlette_responses
from fastapi import BackgroundTasks, FastAPI, Request
from fastapi.responses import FileResponse
from fastapi.staticfiles import StaticFiles
import toml
from mikazuki.models import TaggerInterrogateRequest
from mikazuki.tagger.interrogator import available_interrogators, on_interrogate
app = FastAPI()
lock = Lock()
# fix mimetype errors on some systems
_origin_guess_type = starlette_responses.guess_type
def _hooked_guess_type(*args, **kwargs):
url = args[0]
r = _origin_guess_type(*args, **kwargs)
if url.endswith(".js"):
r = ("application/javascript", None)
elif url.endswith(".css"):
r = ("text/css", None)
return r
starlette_responses.guess_type = _hooked_guess_type
def run_train(toml_path: str):
print(f"Training started with config file / 训练开始,使用配置文件: {toml_path}")
args = [
sys.executable, "-m", "accelerate.commands.launch", "--num_cpu_threads_per_process", "8",
"./sd-scripts/train_network.py",
"--config_file", toml_path,
]
try:
result = subprocess.run(args, env=os.environ)
if result.returncode != 0:
print(f"Training failed / 训练失败")
else:
print(f"Training finished / 训练完成")
except Exception as e:
print(f"An error occurred when training / 创建训练进程时出现致命错误: {e}")
finally:
lock.release()
@app.middleware("http")
async def add_cache_control_header(request, call_next):
response = await call_next(request)
response.headers["Cache-Control"] = "max-age=0"
return response
@app.post("/api/run")
async def create_toml_file(request: Request, background_tasks: BackgroundTasks):
acquired = lock.acquire(blocking=False)
if not acquired:
print("Training is already running / 已有正在进行的训练")
return {"status": "fail", "detail": "Training is already running"}
timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
toml_file = os.path.join(os.getcwd(), "toml", "autosave", f"{timestamp}.toml")
toml_data = await request.body()
j = json.loads(toml_data.decode("utf-8"))
with open(toml_file, "w") as f:
f.write(toml.dumps(j))
background_tasks.add_task(run_train, toml_file)
return {"status": "success"}
@app.post("/api/interrogate")
async def run_interrogate(req: TaggerInterrogateRequest, background_tasks: BackgroundTasks):
interrogator = available_interrogators.get(req.interrogator_model, available_interrogators["wd14-convnextv2-v2"])
background_tasks.add_task(on_interrogate,
image=None,
batch_input_glob=req.path,
batch_input_recursive=False,
batch_output_dir="",
batch_output_filename_format="[name].[output_extension]",
batch_output_action_on_conflict=req.batch_output_action_on_conflict,
batch_remove_duplicated_tag=True,
batch_output_save_json=False,
interrogator=interrogator,
threshold=req.threshold,
additional_tags=req.additional_tags,
exclude_tags=req.exclude_tags,
sort_by_alphabetical_order=False,
add_confident_as_weight=False,
replace_underscore=req.replace_underscore,
replace_underscore_excludes=req.replace_underscore_excludes,
escape_tag=req.escape_tag,
unload_model_after_running=True
)
return {"status": "success"}
@app.get("/")
async def index():
return FileResponse("./frontend/dist/index.html")
app.mount("/", StaticFiles(directory="frontend/dist"), name="static")
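A minimal client sketch for the `/api/run` endpoint above, assuming the default `127.0.0.1:28000` from `gui.py`. The handler serializes the received JSON directly to a TOML file, so the keys below are illustrative training arguments borrowed from the TOML examples in this repo rather than a confirmed schema:

```python
import requests  # any HTTP client works; requests is used here for brevity

# Illustrative keys; the server writes them verbatim into toml/autosave/<timestamp>.toml.
payload = {
    "pretrained_model_name_or_path": "./sd-models/model.ckpt",
    "train_data_dir": "./train/aki",
    "output_name": "aki",
}
r = requests.post("http://127.0.0.1:28000/api/run", json=payload)
print(r.json())  # {"status": "success"} or {"status": "fail", "detail": ...}
```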
from pydantic import BaseModel, Field
class TaggerInterrogateRequest(BaseModel):
path: str
interrogator_model: str = Field(
default="wd14-convnextv2-v2"
)
threshold: float = Field(
default=0.35,
ge=0,
le=1
)
additional_tags: str = ""
exclude_tags: str = ""
escape_tag: bool = True
batch_output_action_on_conflict: str = "ignore"
replace_underscore: bool = True
replace_underscore_excludes: str = Field(
default="0_0, (o)_(o), +_+, +_-, ._., <o>_<o>, <|>_<|>, =_=, >_<, 3_3, 6_9, >_o, @_@, ^_^, o_o, u_u, x_x, |_|, ||_||"
)
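For illustration, constructing the model directly shows the defaults and the bounds pydantic enforces on `threshold` (the path value is a placeholder):

```python
req = TaggerInterrogateRequest(path="./train/aki")
print(req.interrogator_model, req.threshold)  # wd14-convnextv2-v2 0.35
# Values outside the Field(ge=0, le=1) bounds are rejected:
# TaggerInterrogateRequest(path=".", threshold=1.5) raises a ValidationError
```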
# DanBooru image utility functions
import cv2
import numpy as np
from PIL import Image
def smart_imread(img, flag=cv2.IMREAD_UNCHANGED):
if img.endswith(".gif"):
img = Image.open(img)
img = img.convert("RGB")
img = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
else:
img = cv2.imread(img, flag)
return img
def smart_24bit(img):
if img.dtype is np.dtype(np.uint16):
img = (img / 257).astype(np.uint8)
if len(img.shape) == 2:
img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
elif img.shape[2] == 4:
trans_mask = img[:, :, 3] == 0
img[trans_mask] = [255, 255, 255, 255]
img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
return img
def make_square(img, target_size):
old_size = img.shape[:2]
desired_size = max(old_size)
desired_size = max(desired_size, target_size)
delta_w = desired_size - old_size[1]
delta_h = desired_size - old_size[0]
top, bottom = delta_h // 2, delta_h - (delta_h // 2)
left, right = delta_w // 2, delta_w - (delta_w // 2)
color = [255, 255, 255]
new_im = cv2.copyMakeBorder(
img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color
)
return new_im
def smart_resize(img, size):
# Assumes the image has already gone through make_square
if img.shape[0] > size:
img = cv2.resize(img, (size, size), interpolation=cv2.INTER_AREA)
elif img.shape[0] < size:
img = cv2.resize(img, (size, size), interpolation=cv2.INTER_CUBIC)
return img
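A sketch of how these helpers are meant to chain when preparing an image for the tagger; the file name and the 448-pixel target size are placeholders:

```python
img = smart_imread("train/aki/example.png")  # BGR ndarray; GIFs are decoded via PIL first
img = smart_24bit(img)        # 16-bit -> 8-bit, grayscale -> BGR, alpha flattened onto white
img = make_square(img, 448)   # pad with a white border to a square of at least 448 px
img = smart_resize(img, 448)  # INTER_AREA when shrinking, INTER_CUBIC when enlarging
```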
import re
import hashlib
from typing import Dict, Callable, NamedTuple
from pathlib import Path
class Info(NamedTuple):
path: Path
output_ext: str
def hash(i: Info, algo='sha1') -> str:
try:
hash = hashlib.new(algo)
except ValueError:
raise ValueError(f"'{algo}' is an invalid hash algorithm")
# TODO: is it okay to hash a large image?
with open(i.path, 'rb') as file:
hash.update(file.read())
return hash.hexdigest()
pattern = re.compile(r'\[([\w:]+)\]')
# every formatter must return a string or raise TypeError or ValueError
# any other error will cause the extension to fail
available_formats: Dict[str, Callable] = {
'name': lambda i: i.path.stem,
'extension': lambda i: i.path.suffix[1:],
'hash': hash,
'output_extension': lambda i: i.output_ext
}
def format(match: re.Match, info: Info) -> str:
matches = match[1].split(':')
name, args = matches[0], matches[1:]
if name not in available_formats:
return match[0]
return available_formats[name](info, *args)
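A usage sketch, continuing in the module above: every `[token]` in an output-filename template is substituted through `pattern.sub` (the file name is made up):

```python
info = Info(path=Path("images/cat.png"), output_ext="txt")
new_name = pattern.sub(lambda m: format(m, info), "[name].[output_extension]")
print(new_name)  # cat.txt
```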
import os
import sys
import subprocess
import importlib.util
python_bin = sys.executable
def is_installed(package):
try:
spec = importlib.util.find_spec(package)
except ModuleNotFoundError:
return False
return spec is not None
def run(command, desc=None, errdesc=None, custom_env=None, live=False):
if desc is not None:
print(desc)
if live:
result = subprocess.run(command, shell=True, env=os.environ if custom_env is None else custom_env)
if result.returncode != 0:
raise RuntimeError(f"""{errdesc or 'Error running command'}.
Command: {command}
Error code: {result.returncode}""")
return ""
result = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True, env=os.environ if custom_env is None else custom_env)
if result.returncode != 0:
message = f"""{errdesc or 'Error running command'}.
Command: {command}
Error code: {result.returncode}
stdout: {result.stdout.decode(encoding="utf8", errors="ignore") if len(result.stdout) > 0 else '<empty>'}
stderr: {result.stderr.decode(encoding="utf8", errors="ignore") if len(result.stderr) > 0 else '<empty>'}
"""
raise RuntimeError(message)
return result.stdout.decode(encoding="utf8", errors="ignore")
def run_pip(command, desc=None, live=False):
return run(f'"{python_bin}" -m pip {command}', desc=f"Installing {desc}", errdesc=f"Couldn't install {desc}", live=live)
# LoRA resize script by @bdsqlsz
$save_precision = "fp16" # precision in saving, default float | 保存精度, 可选 float、fp16、bf16, 默认 float
$new_rank = 4 # dim rank of output LoRA | dim rank等级, 默认 4
$model = "./output/lora_name.safetensors" # original LoRA model path need to resize, save as cpkt or safetensors | 需要调整大小的模型路径, 保存格式 cpkt 或 safetensors
$save_to = "./output/lora_name_new.safetensors" # output LoRA model path, save as ckpt or safetensors | 输出路径, 保存格式 cpkt 或 safetensors
$device = "cuda" # device to use, cuda for GPU | 使用 GPU跑, 默认 CPU
$verbose = 1 # display verbose resizing information | rank变更时, 显示详细信息
$dynamic_method = "" # Specify dynamic resizing method, --new_rank is used as a hard limit for max rank | 动态调节大小,可选"sv_ratio", "sv_fro", "sv_cumulative",默认无
$dynamic_param = "" # Specify target for dynamic reduction | 动态参数,sv_ratio模式推荐1~2, sv_cumulative模式0~1, sv_fro模式0~1, 比sv_cumulative要高
# Activate python venv
.\venv\Scripts\activate
$Env:HF_HOME = "huggingface"
$ext_args = [System.Collections.ArrayList]::new()
if ($verbose) {
[void]$ext_args.Add("--verbose")
}
if ($dynamic_method) {
[void]$ext_args.Add("--dynamic_method=" + $dynamic_method)
}
if ($dynamic_param) {
[void]$ext_args.Add("--dynamic_param=" + $dynamic_param)
}
# run resize
accelerate launch --num_cpu_threads_per_process=8 "./sd-scripts/networks/resize_lora.py" `
--save_precision=$save_precision `
--new_rank=$new_rank `
--model=$model `
--save_to=$save_to `
--device=$device `
$ext_args
Write-Output "Resize finished"
Read-Host | Out-Null ;
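For example, enabling dynamic resizing could look like this (illustrative values within the ranges described above):

```powershell
$dynamic_method = "sv_fro"  # one of sv_ratio, sv_fro, sv_cumulative
$dynamic_param = "0.9"      # 0~1 for sv_fro; --new_rank still caps the maximum rank
```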
.\venv\Scripts\activate
$Env:HF_HOME = "huggingface"
$Env:PYTHONUTF8 = "1"
python gui.py
#!/bin/bash
export HF_HOME=huggingface
export PYTHONUTF8=1
python gui.py "$@"
# LoRA svd_merge script by @bdsqlsz
$save_precision = "fp16" # precision in saving, default float | 保存精度, 可选 float、fp16、bf16, 默认 和源文件相同
$precision = "float" # precision in merging (float is recommended) | 合并时计算精度, 可选 float、fp16、bf16, 推荐float
$new_rank = 4 # dim rank of output LoRA | dim rank等级, 默认 4
$models = "./output/modelA.safetensors ./output/modelB.safetensors" # original LoRA model path need to resize, save as cpkt or safetensors | 需要合并的模型路径, 保存格式 cpkt 或 safetensors,多个用空格隔开
$ratios = "1.0 -1.0" # ratios for each model / LoRA模型合并比例,数量等于模型数量,多个用空格隔开
$save_to = "./output/lora_name_new.safetensors" # output LoRA model path, save as ckpt or safetensors | 输出路径, 保存格式 cpkt 或 safetensors
$device = "cuda" # device to use, cuda for GPU | 使用 GPU跑, 默认 CPU
$new_conv_rank = 0 # Specify rank of output LoRA for Conv2d 3x3, None for same as new_rank | Conv2d 3x3输出,没有默认同new_rank
# Activate python venv
.\venv\Scripts\activate
$Env:HF_HOME = "huggingface"
$Env:XFORMERS_FORCE_DISABLE_TRITON = "1"
$ext_args = [System.Collections.ArrayList]::new()
[void]$ext_args.Add("--models")
foreach ($model in $models.Split(" ")) {
[void]$ext_args.Add($model)
}
[void]$ext_args.Add("--ratios")
foreach ($ratio in $ratios.Split(" ")) {
[void]$ext_args.Add([float]$ratio)
}
if ($new_conv_rank) {
[void]$ext_args.Add("--new_conv_rank=" + $new_conv_rank)
}
# run svd_merge
accelerate launch --num_cpu_threads_per_process=8 "./sd-scripts/networks/svd_merge_lora.py" `
--save_precision=$save_precision `
--precision=$precision `
--new_rank=$new_rank `
--save_to=$save_to `
--device=$device `
$ext_args
Write-Output "SVD Merge finished"
Read-Host | Out-Null ;
$Env:TF_CPP_MIN_LOG_LEVEL = "3"
.\venv\Scripts\activate
tensorboard --logdir=logs
[model]
v2 = false
v_parameterization = false
pretrained_model_name_or_path = "./sd-models/model.ckpt"
[dataset]
train_data_dir = "./train/input"
reg_data_dir = ""
prior_loss_weight = 1
cache_latents = true
shuffle_caption = true
enable_bucket = true
[additional_network]
network_dim = 32
network_alpha = 16
network_train_unet_only = false
network_train_text_encoder_only = false
network_module = "networks.lora"
network_args = []
[optimizer]
unet_lr = 1e-4
text_encoder_lr = 1e-5
optimizer_type = "AdamW8bit"
lr_scheduler = "cosine_with_restarts"
lr_warmup_steps = 0
lr_restart_cycles = 1
[training]
resolution = "512,512"
train_batch_size = 1
max_train_epochs = 10
noise_offset = 0.0
keep_tokens = 0
xformers = true
lowram = false
clip_skip = 2
mixed_precision = "fp16"
save_precision = "fp16"
[sample_prompt]
sample_sampler = "euler_a"
sample_every_n_epochs = 1
[saving]
output_name = "output_name"
save_every_n_epochs = 1
save_n_epoch_ratio = 0
save_last_n_epochs = 499
save_state = false
save_model_as = "safetensors"
output_dir = "./output"
logging_dir = "./logs"
log_prefix = "output_name"
[others]
min_bucket_reso = 256
max_bucket_reso = 1024
caption_extension = ".txt"
max_token_length = 225
seed = 1337
[model_arguments]
v2 = false
v_parameterization = false
pretrained_model_name_or_path = "./sd-models/model.ckpt"
[dataset_arguments]
train_data_dir = "./train/aki"
reg_data_dir = ""
resolution = "512,512"
prior_loss_weight = 1
[additional_network_arguments]
network_dim = 32
network_alpha = 16
network_train_unet_only = false
network_train_text_encoder_only = false
network_module = "networks.lora"
network_args = []
[optimizer_arguments]
unet_lr = 1e-4
text_encoder_lr = 1e-5
optimizer_type = "AdamW8bit"
lr_scheduler = "cosine_with_restarts"
lr_warmup_steps = 0
lr_restart_cycles = 1
[training_arguments]
train_batch_size = 1
noise_offset = 0.0
keep_tokens = 0
min_bucket_reso = 256
max_bucket_reso = 1024
caption_extension = ".txt"
max_token_length = 225
seed = 1337
xformers = true
lowram = false
max_train_epochs = 10
resolution = "512,512"
clip_skip = 2
mixed_precision = "fp16"
[sample_prompt_arguments]
sample_sampler = "euler_a"
sample_every_n_epochs = 5
[saving_arguments]
output_name = "output_name"
save_every_n_epochs = 1
save_state = false
save_model_as = "safetensors"
output_dir = "./output"
logging_dir = "./logs"
log_prefix = ""
save_precision = "fp16"
[others]
cache_latents = true
shuffle_caption = true
enable_bucket = true
(masterpiece, best quality:1.2), 1girl, solo, --n lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts,signature, watermark, username, blurry, --w 512 --h 768 --l 7 --s 24 --d 1337
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"# Train data path | 设置训练用模型、图片\n",
"pretrained_model = \"./sd-models/model.ckpt\" # base model path | 底模路径\n",
"train_data_dir = \"./train/aki\" # train dataset path | 训练数据集路径\n",
"\n",
"# Train related params | 训练相关参数\n",
"resolution = \"512,512\" # image resolution w,h. 图片分辨率,宽,高。支持非正方形,但必须是 64 倍数。\n",
"batch_size = 1 # batch size\n",
"max_train_epoches = 10 # max train epoches | 最大训练 epoch\n",
"save_every_n_epochs = 2 # save every n epochs | 每 N 个 epoch 保存一次\n",
"network_dim = 32 # network dim | 常用 4~128,不是越大越好\n",
"network_alpha= 32 # network alpha | 常用与 network_dim 相同的值或者采用较小的值,如 network_dim的一半 防止下溢。默认值为 1,使用较小的 alpha 需要提升学习率。\n",
"clip_skip = 2 # clip skip | 玄学 一般用 2\n",
"train_unet_only = 0 # train U-Net only | 仅训练 U-Net,开启这个会牺牲效果大幅减少显存使用。6G显存可以开启\n",
"train_text_encoder_only = 0 # train Text Encoder only | 仅训练 文本编码器\n",
"\n",
"# Learning rate | 学习率\n",
"lr = \"1e-4\"\n",
"unet_lr = \"1e-4\"\n",
"text_encoder_lr = \"1e-5\"\n",
"lr_scheduler = \"cosine_with_restarts\" # \"linear\", \"cosine\", \"cosine_with_restarts\", \"polynomial\", \"constant\", \"constant_with_warmup\"\n",
"\n",
"# Output settings | 输出设置\n",
"output_name = \"aki\" # output model name | 模型保存名称\n",
"save_model_as = \"safetensors\" # model save ext | 模型保存格式 ckpt, pt, safetensors"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"!accelerate launch --num_cpu_threads_per_process=8 \"./sd-scripts/train_network.py\" \\\n",
" --enable_bucket \\\n",
" --pretrained_model_name_or_path=$pretrained_model \\\n",
" --train_data_dir=$train_data_dir \\\n",
" --output_dir=\"./output\" \\\n",
" --logging_dir=\"./logs\" \\\n",
" --resolution=$resolution \\\n",
" --network_module=networks.lora \\\n",
" --max_train_epochs=$max_train_epoches \\\n",
" --learning_rate=$lr \\\n",
" --unet_lr=$unet_lr \\\n",
" --text_encoder_lr=$text_encoder_lr \\\n",
" --network_dim=$network_dim \\\n",
" --network_alpha=$network_alpha \\\n",
" --output_name=$output_name \\\n",
" --lr_scheduler=$lr_scheduler \\\n",
" --train_batch_size=$batch_size \\\n",
" --save_every_n_epochs=$save_every_n_epochs \\\n",
" --mixed_precision=\"fp16\" \\\n",
" --save_precision=\"fp16\" \\\n",
" --seed=\"1337\" \\\n",
" --cache_latents \\\n",
" --clip_skip=$clip_skip \\\n",
" --prior_loss_weight=1 \\\n",
" --max_token_length=225 \\\n",
" --caption_extension=\".txt\" \\\n",
" --save_model_as=$save_model_as \\\n",
" --xformers --shuffle_caption --use_8bit_adam"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.10.7 (tags/v3.10.7:6cc6b13, Sep 5 2022, 14:08:36) [MSC v.1933 64 bit (AMD64)]"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "675b13e958f0d0236d13cdfe08a1df3882cae564fa23a2e7e5eb1f2c6c632b02"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}
# LoRA train script by @Akegarasu
# Train data path | 设置训练用模型、图片
$pretrained_model = "./sd-models/model.ckpt" # base model path
$is_v2_model = 0 # SD2.0 model; clip_skip has no effect with 2.0 models
$parameterization = 0 # v-parameterization; experimental, must be used together with the v2 option
$train_data_dir = "./train/aki" # train dataset path
$reg_data_dir = "" # directory for regularization images; regularization is disabled by default
# Network settings
$network_module = "networks.lora" # network type to train; defaults to networks.lora (LoRA training). To train LyCORIS (LoCon, LoHa) etc., change this to lycoris.kohya
$network_weights = "" # pretrained weights for the LoRA network; to continue training from an existing LoRA model, set its path here
$network_dim = 32 # network dim; commonly 4~128, bigger is not always better
$network_alpha = 32 # network alpha; commonly the same as network_dim or a smaller value such as half of it to prevent underflow. The default is 1; a smaller alpha requires a higher learning rate.
# Train related params
$resolution = "512,512" # image resolution w,h; non-square is supported, but both sides must be multiples of 64
$batch_size = 1 # batch size
$max_train_epoches = 10 # max train epochs
$save_every_n_epochs = 2 # save every n epochs
$train_unet_only = 0 # train U-Net only; trades quality for a large cut in VRAM usage, usable on 6 GB cards
$train_text_encoder_only = 0 # train Text Encoder only
$stop_text_encoder_training = 0 # stop training the text encoder at step N
$noise_offset = 0 # noise offset; adds a noise offset during training to improve very dark or very bright images. 0.1 is recommended if enabled
$keep_tokens = 0 # keep the first N tokens unchanged when shuffling caption tokens
$min_snr_gamma = 0 # gamma for the Min-SNR weighting strategy; 0 disables it (default 0)
# Learning rate
$lr = "1e-4"
$unet_lr = "1e-4"
$text_encoder_lr = "1e-5"
$lr_scheduler = "cosine_with_restarts" # "linear", "cosine", "cosine_with_restarts", "polynomial", "constant", "constant_with_warmup"
$lr_warmup_steps = 0 # warmup steps; must be 0 when lr_scheduler is constant or adafactor
$lr_restart_cycles = 1 # restart cycles for cosine_with_restarts; only takes effect when lr_scheduler is cosine_with_restarts
# Output settings
$output_name = "aki" # output model name
$save_model_as = "safetensors" # model save format: ckpt, pt, safetensors
# Resume training state
$save_state = 0 # save training state; saved as <output_name>-??????-state where ?????? is the epoch number
$resume = "" # resume from a state folder; use together with the option above. Due to format limitations, the epoch number and global step are not saved and restart from 1 even when resuming; this differs from how network_weights behaves
# Other settings
$min_bucket_reso = 256 # arb min resolution
$max_bucket_reso = 1024 # arb max resolution
$persistent_data_loader_workers = 0 # keep dataloader workers alive to reduce the pause between epochs; can easily exhaust RAM
$clip_skip = 2 # clip skip; somewhat arcane, 2 is the usual choice
$multi_gpu = 0 # multi-GPU training; only use with 2 or more GPUs
$lowram = 0 # low-RAM mode; moves the U-Net, text encoder, and VAE into GPU VRAM, which may affect VRAM usage
# Optimizer settings
$optimizer_type = "AdamW8bit" # optimizer type; default AdamW8bit. Options: AdamW AdamW8bit Lion SGDNesterov SGDNesterov8bit DAdaptation AdaFactor
# LyCORIS training settings
$algo = "lora" # LyCORIS network algorithm: lora, loha, lokr, ia3, dylora. lora is the same as locon
$conv_dim = 4 # conv dim; similar to network_dim, 4 recommended
$conv_alpha = 4 # conv alpha; similar to network_alpha, use the same value as conv_dim or smaller
$dropout = "0" # dropout probability; 0 disables dropout, larger values mean more dropout, 0~0.5 recommended. Not yet supported for LoHa/LoKr/(IA)^3
# Remote logging settings
$use_wandb = 0 # enable wandb remote logging
$wandb_api_key = "" # wandb API key, obtained from https://wandb.ai/authorize
$log_tracker_name = "" # wandb project name; defaults to "network_train" if left empty
# ============= DO NOT MODIFY CONTENTS BELOW =====================
# Activate python venv
.\venv\Scripts\activate
$Env:HF_HOME = "huggingface"
$Env:XFORMERS_FORCE_DISABLE_TRITON = "1"
$ext_args = [System.Collections.ArrayList]::new()
$launch_args = [System.Collections.ArrayList]::new()
if ($multi_gpu) {
[void]$launch_args.Add("--multi_gpu")
}
if ($lowram) {
[void]$ext_args.Add("--lowram")
}
if ($is_v2_model) {
[void]$ext_args.Add("--v2")
}
else {
[void]$ext_args.Add("--clip_skip=$clip_skip")
}
if ($parameterization) {
[void]$ext_args.Add("--v_parameterization")
}
if ($train_unet_only) {
[void]$ext_args.Add("--network_train_unet_only")
}
if ($train_text_encoder_only) {
[void]$ext_args.Add("--network_train_text_encoder_only")
}
if ($network_weights) {
[void]$ext_args.Add("--network_weights=" + $network_weights)
}
if ($reg_data_dir) {
[void]$ext_args.Add("--reg_data_dir=" + $reg_data_dir)
}
if ($optimizer_type) {
[void]$ext_args.Add("--optimizer_type=" + $optimizer_type)
}
if ($optimizer_type -eq "DAdaptation") {
[void]$ext_args.Add("--optimizer_args")
[void]$ext_args.Add("decouple=True")
}
if ($network_module -eq "lycoris.kohya") {
[void]$ext_args.Add("--network_args")
[void]$ext_args.Add("conv_dim=$conv_dim")
[void]$ext_args.Add("conv_alpha=$conv_alpha")
[void]$ext_args.Add("algo=$algo")
[void]$ext_args.Add("dropout=$dropout")
}
if ($noise_offset -ne 0) {
[void]$ext_args.Add("--noise_offset=$noise_offset")
}
if ($stop_text_encoder_training -ne 0) {
[void]$ext_args.Add("--stop_text_encoder_training=$stop_text_encoder_training")
}
if ($save_state -eq 1) {
[void]$ext_args.Add("--save_state")
}
if ($resume) {
[void]$ext_args.Add("--resume=" + $resume)
}
if ($min_snr_gamma -ne 0) {
[void]$ext_args.Add("--min_snr_gamma=$min_snr_gamma")
}
if ($persistent_data_loader_workers) {
[void]$ext_args.Add("--persistent_data_loader_workers")
}
if ($use_wandb -eq 1) {
[void]$ext_args.Add("--log_with=all")
if ($wandb_api_key) {
[void]$ext_args.Add("--wandb_api_key=" + $wandb_api_key)
}
if ($log_tracker_name) {
[void]$ext_args.Add("--log_tracker_name=" + $log_tracker_name)
}
}
else {
[void]$ext_args.Add("--log_with=tensorboard")
}
# run train
python -m accelerate.commands.launch $launch_args --num_cpu_threads_per_process=8 "./sd-scripts/train_network.py" `
--enable_bucket `
--pretrained_model_name_or_path=$pretrained_model `
--train_data_dir=$train_data_dir `
--output_dir="./output" `
--logging_dir="./logs" `
--log_prefix=$output_name `
--resolution=$resolution `
--network_module=$network_module `
--max_train_epochs=$max_train_epoches `
--learning_rate=$lr `
--unet_lr=$unet_lr `
--text_encoder_lr=$text_encoder_lr `
--lr_scheduler=$lr_scheduler `
--lr_warmup_steps=$lr_warmup_steps `
--lr_scheduler_num_cycles=$lr_restart_cycles `
--network_dim=$network_dim `
--network_alpha=$network_alpha `
--output_name=$output_name `
--train_batch_size=$batch_size `
--save_every_n_epochs=$save_every_n_epochs `
--mixed_precision="fp16" `
--save_precision="fp16" `
--seed="1337" `
--cache_latents `
--prior_loss_weight=1 `
--max_token_length=225 `
--caption_extension=".txt" `
--save_model_as=$save_model_as `
--min_bucket_reso=$min_bucket_reso `
--max_bucket_reso=$max_bucket_reso `
--keep_tokens=$keep_tokens `
--xformers --shuffle_caption $ext_args
Write-Output "Train finished"
Read-Host | Out-Null ;
#!/bin/bash
# LoRA train script by @Akegarasu
# Train data path | 设置训练用模型、图片
pretrained_model="./sd-models/model.ckpt" # base model path
is_v2_model=0 # SD2.0 model; clip_skip has no effect with 2.0 models
parameterization=0 # v-parameterization; experimental, must be used together with the v2 option
train_data_dir="./train/aki" # train dataset path
reg_data_dir="" # directory for regularization images; regularization is disabled by default
# Network settings
network_module="networks.lora" # network type to train; defaults to networks.lora (LoRA training). To train LyCORIS (LoCon, LoHa) etc., change this to lycoris.kohya
network_weights="" # pretrained weights for the LoRA network; to continue training from an existing LoRA model, set its path here
network_dim=32 # network dim; commonly 4~128, bigger is not always better
network_alpha=32 # network alpha; commonly the same as network_dim or a smaller value such as half of it to prevent underflow. The default is 1; a smaller alpha requires a higher learning rate.
# Train related params
resolution="512,512" # image resolution w,h; non-square is supported, but both sides must be multiples of 64
batch_size=1 # batch size
max_train_epoches=10 # max train epochs
save_every_n_epochs=2 # save every n epochs
train_unet_only=0 # train U-Net only; trades quality for a large cut in VRAM usage, usable on 6 GB cards
train_text_encoder_only=0 # train Text Encoder only
stop_text_encoder_training=0 # stop training the text encoder at step N
noise_offset="0" # noise offset; adds a noise offset during training to improve very dark or very bright images. 0.1 is recommended if enabled
keep_tokens=0 # keep the first N tokens unchanged when shuffling caption tokens
min_snr_gamma=0 # gamma for the Min-SNR weighting strategy; 0 disables it (default 0)
# Learning rate
lr="1e-4"
unet_lr="1e-4"
text_encoder_lr="1e-5"
lr_scheduler="cosine_with_restarts" # "linear", "cosine", "cosine_with_restarts", "polynomial", "constant", "constant_with_warmup", "adafactor"
lr_warmup_steps=0 # warmup steps; must be 0 when lr_scheduler is constant or adafactor
lr_restart_cycles=1 # restart cycles for cosine_with_restarts; only takes effect when lr_scheduler is cosine_with_restarts
# Output settings
output_name="aki" # output model name
save_model_as="safetensors" # model save format: ckpt, pt, safetensors
# Resume training state
save_state=0 # save training state; saved as <output_name>-??????-state where ?????? is the epoch number
resume="" # resume from a state folder; use together with the option above. Due to format limitations, the epoch number and global step are not saved and restart from 1 even when resuming; this differs from how network_weights behaves
# Other settings
min_bucket_reso=256 # arb min resolution
max_bucket_reso=1024 # arb max resolution
persistent_data_loader_workers=0 # keep dataloader workers alive to reduce the pause between epochs; can easily exhaust RAM
clip_skip=2 # clip skip; somewhat arcane, 2 is the usual choice
multi_gpu=0 # multi-GPU training; only use with 2 or more GPUs
# Optimizer settings
optimizer_type="AdamW8bit" # optimizer type; default AdamW8bit. Options: AdamW AdamW8bit Lion SGDNesterov SGDNesterov8bit DAdaptation AdaFactor
# LyCORIS training settings
algo="lora" # LyCORIS network algorithm: lora, loha, lokr, ia3, dylora. lora is the same as locon
conv_dim=4 # conv dim; similar to network_dim, 4 recommended
conv_alpha=4 # conv alpha; similar to network_alpha, use the same value as conv_dim or smaller
dropout="0" # dropout probability; 0 disables dropout, larger values mean more dropout, 0~0.5 recommended. Not yet supported for LoHa/LoKr/(IA)^3
# Remote logging settings
use_wandb=0 # enable wandb remote logging
wandb_api_key="" # wandb API key, obtained from https://wandb.ai/authorize
log_tracker_name="" # wandb project name; defaults to "network_train" if left empty
# ============= DO NOT MODIFY CONTENTS BELOW =====================
export HF_HOME="huggingface"
export TF_CPP_MIN_LOG_LEVEL=3
extArgs=()
launchArgs=()
if [[ $multi_gpu == 1 ]]; then launchArgs+=("--multi_gpu"); fi
if [[ $is_v2_model == 1 ]]; then
extArgs+=("--v2");
else
extArgs+=("--clip_skip $clip_skip");
fi
if [[ $parameterization == 1 ]]; then extArgs+=("--v_parameterization"); fi
if [[ $train_unet_only == 1 ]]; then extArgs+=("--network_train_unet_only"); fi
if [[ $train_text_encoder_only == 1 ]]; then extArgs+=("--network_train_text_encoder_only"); fi
if [[ $network_weights ]]; then extArgs+=("--network_weights $network_weights"); fi
if [[ $reg_data_dir ]]; then extArgs+=("--reg_data_dir $reg_data_dir"); fi
if [[ $optimizer_type ]]; then extArgs+=("--optimizer_type $optimizer_type"); fi
if [[ $optimizer_type == "DAdaptation" ]]; then extArgs+=("--optimizer_args decouple=True"); fi
if [[ $save_state == 1 ]]; then extArgs+=("--save_state"); fi
if [[ $resume ]]; then extArgs+=("--resume $resume"); fi
if [[ $persistent_data_loader_workers == 1 ]]; then extArgs+=("--persistent_data_loader_workers"); fi
if [[ $network_module == "lycoris.kohya" ]]; then
extArgs+=("--network_args conv_dim=$conv_dim conv_alpha=$conv_alpha algo=$algo dropout=$dropout")
fi
if [[ $stop_text_encoder_training -ne 0 ]]; then extArgs+=("--stop_text_encoder_training $stop_text_encoder_training"); fi
if [[ $noise_offset != "0" ]]; then extArgs+=("--noise_offset $noise_offset"); fi
if [[ $min_snr_gamma -ne 0 ]]; then extArgs+=("--min_snr_gamma $min_snr_gamma"); fi
if [[ $use_wandb == 1 ]]; then
extArgs+=("--log_with=all")
else
extArgs+=("--log_with=tensorboard")
fi
if [[ $wandb_api_key ]]; then extArgs+=("--wandb_api_key $wandb_api_key"); fi
if [[ $log_tracker_name ]]; then extArgs+=("--log_tracker_name $log_tracker_name"); fi
python -m accelerate.commands.launch ${launchArgs[@]} --num_cpu_threads_per_process=8 "./sd-scripts/train_network.py" \
--enable_bucket \
--pretrained_model_name_or_path=$pretrained_model \
--train_data_dir=$train_data_dir \
--output_dir="./output" \
--logging_dir="./logs" \
--log_prefix=$output_name \
--resolution=$resolution \
--network_module=$network_module \
--max_train_epochs=$max_train_epoches \
--learning_rate=$lr \
--unet_lr=$unet_lr \
--text_encoder_lr=$text_encoder_lr \
--lr_scheduler=$lr_scheduler \
--lr_warmup_steps=$lr_warmup_steps \
--lr_scheduler_num_cycles=$lr_restart_cycles \
--network_dim=$network_dim \
--network_alpha=$network_alpha \
--output_name=$output_name \
--train_batch_size=$batch_size \
--save_every_n_epochs=$save_every_n_epochs \
--mixed_precision="fp16" \
--save_precision="fp16" \
--seed="1337" \
--cache_latents \
--prior_loss_weight=1 \
--max_token_length=225 \
--caption_extension=".txt" \
--save_model_as=$save_model_as \
--min_bucket_reso=$min_bucket_reso \
--max_bucket_reso=$max_bucket_reso \
--keep_tokens=$keep_tokens \
--xformers --shuffle_caption ${extArgs[@]}
# LoRA train script by @Akegarasu
$multi_gpu = 0 # multi-GPU training; only use with 2 or more GPUs
$config_file = "./toml/default.toml" # specify training parameters via a toml file
$sample_prompts = "./toml/sample_prompts.txt" # sample prompts file; leave empty to disable sampling
$utf8 = 1 # read the toml as UTF-8; must be enabled for UTF-8 toml files containing Chinese text
# ============= DO NOT MODIFY CONTENTS BELOW =====================
# Activate python venv
.\venv\Scripts\activate
$Env:HF_HOME = "huggingface"
$ext_args = [System.Collections.ArrayList]::new()
$launch_args = [System.Collections.ArrayList]::new()
if ($multi_gpu) {
[void]$launch_args.Add("--multi_gpu")
}
if ($utf8 -eq 1) {
$Env:PYTHONUTF8 = 1
}
# run train
python -m accelerate.commands.launch $launch_args --num_cpu_threads_per_process=8 "./sd-scripts/train_network.py" `
--config_file=$config_file `
--sample_prompts=$sample_prompts `
$ext_args
Write-Output "Train finished"
Read-Host | Out-Null ;
#!/bin/bash
# LoRA train script by @Akegarasu
multi_gpu=0 # multi-GPU training; only use with 2 or more GPUs
config_file="./toml/default.toml" # specify training parameters via a toml file
sample_prompts="./toml/sample_prompts.txt" # sample prompts file; leave empty to disable sampling
utf8=1 # read the toml as UTF-8; must be enabled for UTF-8 toml files containing Chinese text
# ============= DO NOT MODIFY CONTENTS BELOW =====================
export HF_HOME="huggingface"
export TF_CPP_MIN_LOG_LEVEL=3
extArgs=()
launchArgs=()
if [[ $multi_gpu == 1 ]]; then launchArgs+=("--multi_gpu"); fi
if [[ $utf8 == 1 ]]; then export PYTHONUTF8=1; fi
# run train
python -m accelerate.commands.launch ${launchArgs[@]} --num_cpu_threads_per_process=8 "./sd-scripts/train_network.py" \
--config_file=$config_file \
--sample_prompts=$sample_prompts \
${extArgs[@]}
extensions
extensions-disabled
repositories
venv
/* global module */
module.exports = {
env: {
browser: true,
es2021: true,
},
extends: "eslint:recommended",
parserOptions: {
ecmaVersion: "latest",
},
rules: {
"arrow-spacing": "error",
"block-spacing": "error",
"brace-style": "error",
"comma-dangle": ["error", "only-multiline"],
"comma-spacing": "error",
"comma-style": ["error", "last"],
"curly": ["error", "multi-line", "consistent"],
"eol-last": "error",
"func-call-spacing": "error",
"function-call-argument-newline": ["error", "consistent"],
"function-paren-newline": ["error", "consistent"],
"indent": ["error", 4],
"key-spacing": "error",
"keyword-spacing": "error",
"linebreak-style": ["error", "unix"],
"no-extra-semi": "error",
"no-mixed-spaces-and-tabs": "error",
"no-multi-spaces": "error",
"no-redeclare": ["error", {builtinGlobals: false}],
"no-trailing-spaces": "error",
"no-unused-vars": "off",
"no-whitespace-before-property": "error",
"object-curly-newline": ["error", {consistent: true, multiline: true}],
"object-curly-spacing": ["error", "never"],
"operator-linebreak": ["error", "after"],
"quote-props": ["error", "consistent-as-needed"],
"semi": ["error", "always"],
"semi-spacing": "error",
"semi-style": ["error", "last"],
"space-before-blocks": "error",
"space-before-function-paren": ["error", "never"],
"space-in-parens": ["error", "never"],
"space-infix-ops": "error",
"space-unary-ops": "error",
"switch-colon-spacing": "error",
"template-curly-spacing": ["error", "never"],
"unicode-bom": "error",
},
globals: {
//script.js
gradioApp: "readonly",
executeCallbacks: "readonly",
onAfterUiUpdate: "readonly",
onOptionsChanged: "readonly",
onUiLoaded: "readonly",
onUiUpdate: "readonly",
uiCurrentTab: "writable",
uiElementInSight: "readonly",
uiElementIsVisible: "readonly",
//ui.js
opts: "writable",
all_gallery_buttons: "readonly",
selected_gallery_button: "readonly",
selected_gallery_index: "readonly",
switch_to_txt2img: "readonly",
switch_to_img2img_tab: "readonly",
switch_to_img2img: "readonly",
switch_to_sketch: "readonly",
switch_to_inpaint: "readonly",
switch_to_inpaint_sketch: "readonly",
switch_to_extras: "readonly",
get_tab_index: "readonly",
create_submit_args: "readonly",
restart_reload: "readonly",
updateInput: "readonly",
//extraNetworks.js
requestGet: "readonly",
popup: "readonly",
// from python
localization: "readonly",
// progressbar.js
randomId: "readonly",
requestProgress: "readonly",
// imageviewer.js
modalPrevImage: "readonly",
modalNextImage: "readonly",
// token-counters.js
setupTokenCounters: "readonly",
}
};
# Apply ESlint
9c54b78d9dde5601e916f308d9a9d6953ec39430
name: Bug Report
description: You think something is broken in the UI
title: "[Bug]: "
labels: ["bug-report"]
body:
- type: checkboxes
attributes:
label: Is there an existing issue for this?
description: Please search to see if an issue already exists for the bug you encountered, and that it hasn't been fixed in a recent build/commit.
options:
- label: I have searched the existing issues and checked the recent builds/commits
required: true
- type: markdown
attributes:
value: |
*Please fill this form with as much information as possible; don't forget to fill "What OS..." and "What browsers", and provide screenshots if possible*
- type: textarea
id: what-did
attributes:
label: What happened?
description: Tell us what happened in a very clear and simple way
validations:
required: true
- type: textarea
id: steps
attributes:
label: Steps to reproduce the problem
description: Please provide us with precise step by step information on how to reproduce the bug
value: |
1. Go to ....
2. Press ....
3. ...
validations:
required: true
- type: textarea
id: what-should
attributes:
label: What should have happened?
description: Tell us what you think the normal behavior should be
validations:
required: true
- type: input
id: commit
attributes:
label: Version or Commit where the problem happens
description: "Which webui version or commit are you running ? (Do not write *Latest Version/repo/commit*, as this means nothing and will have changed by the time we read your issue. Rather, copy the **Version: v1.2.3** link at the bottom of the UI, or from the cmd/terminal if you can't launch it.)"
validations:
required: true
- type: dropdown
id: py-version
attributes:
label: What Python version are you running on?
multiple: false
options:
- Python 3.10.x
- Python 3.11.x (or above; not supported yet)
- Python 3.9.x (or below; not recommended)
- type: dropdown
id: platforms
attributes:
label: What platforms do you use to access the UI?
multiple: true
options:
- Windows
- Linux
- MacOS
- iOS
- Android
- Other/Cloud
- type: dropdown
id: device
attributes:
label: What device are you running WebUI on?
multiple: true
options:
- Nvidia GPUs (RTX 20 above)
- Nvidia GPUs (GTX 16 below)
- AMD GPUs (RX 6000 above)
- AMD GPUs (RX 5000 below)
- CPU
- Other GPUs
- type: dropdown
id: cross_attention_opt
attributes:
label: Cross attention optimization
description: What cross attention optimization are you using? (Settings -> Optimizations -> Cross attention optimization)
multiple: false
options:
- Automatic
- xformers
- sdp-no-mem
- sdp
- Doggettx
- V1
- InvokeAI
- "None "
validations:
required: true
- type: dropdown
id: browsers
attributes:
label: What browsers do you use to access the UI?
multiple: true
options:
- Mozilla Firefox
- Google Chrome
- Brave
- Apple Safari
- Microsoft Edge
- type: textarea
id: cmdargs
attributes:
label: Command Line Arguments
description: Are you using any launching parameters/command line arguments (modified webui-user.bat/.sh)? If yes, please write them below. Write "No" otherwise.
render: Shell
validations:
required: true
- type: textarea
id: extensions
attributes:
label: List of extensions
description: Are you using any extensions other than built-ins? If yes, provide a list; you can copy it from the "Extensions" tab. Write "No" otherwise.
validations:
required: true
- type: textarea
id: logs
attributes:
label: Console logs
description: Please provide **full** cmd/terminal logs from the moment you started the UI until after the bug happened. If they are very long, provide a link to pastebin or a similar service.
render: Shell
validations:
required: true
- type: textarea
id: misc
attributes:
label: Additional information
description: Please provide us with any relevant additional info or context.
blank_issues_enabled: false
contact_links:
- name: WebUI Community Support
url: https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions
about: Please ask and answer questions here.
name: Feature request
description: Suggest an idea for this project
title: "[Feature Request]: "
labels: ["enhancement"]
body:
- type: checkboxes
attributes:
label: Is there an existing issue for this?
description: Please search to see if an issue already exists for the feature you want, and that it's not implemented in a recent build/commit.
options:
- label: I have searched the existing issues and checked the recent builds/commits
required: true
- type: markdown
attributes:
value: |
*Please fill this form with as much information as possible, provide screenshots and/or illustrations of the feature if possible*
- type: textarea
id: feature
attributes:
label: What would your feature do?
description: Tell us about your feature in a very clear and simple way, and what problem it would solve
validations:
required: true
- type: textarea
id: workflow
attributes:
label: Proposed workflow
description: Please provide us with step by step information on how you'd like the feature to be accessed and used
value: |
1. Go to ....
2. Press ....
3. ...
validations:
required: true
- type: textarea
id: misc
attributes:
label: Additional information
description: Add any other context or screenshots about the feature request here.
## Description
* a simple description of what you're trying to accomplish
* a summary of changes in code
* which issues it fixes, if any
## Screenshots/videos:
## Checklist:
- [ ] I have read [contributing wiki page](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Contributing)
- [ ] I have performed a self-review of my own code
- [ ] My code follows the [style guidelines](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Contributing#code-style)
- [ ] My code passes [tests](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Tests)
name: Run Linting/Formatting on Pull Requests
on:
- push
- pull_request
jobs:
lint-python:
runs-on: ubuntu-latest
steps:
- name: Checkout Code
uses: actions/checkout@v3
- uses: actions/setup-python@v4
with:
python-version: 3.11
# NB: there's no cache: pip here since we're not installing anything
# from the requirements.txt file(s) in the repository; it's faster
# not to have GHA download an (at the time of writing) 4 GB cache
# of PyTorch and other dependencies.
- name: Install Ruff
run: pip install ruff==0.0.265
- name: Run Ruff
run: ruff .
lint-js:
runs-on: ubuntu-latest
steps:
- name: Checkout Code
uses: actions/checkout@v3
- name: Install Node.js
uses: actions/setup-node@v3
with:
node-version: 18
- run: npm i --ci
- run: npm run lint
name: Run basic features tests on CPU with empty SD model
on:
- push
- pull_request
jobs:
test:
runs-on: ubuntu-latest
steps:
- name: Checkout Code
uses: actions/checkout@v3
- name: Set up Python 3.10
uses: actions/setup-python@v4
with:
python-version: 3.10.6
cache: pip
cache-dependency-path: |
**/requirements*txt
launch.py
- name: Install test dependencies
run: pip install wait-for-it -r requirements-test.txt
env:
PIP_DISABLE_PIP_VERSION_CHECK: "1"
PIP_PROGRESS_BAR: "off"
- name: Setup environment
run: python launch.py --skip-torch-cuda-test --exit
env:
PIP_DISABLE_PIP_VERSION_CHECK: "1"
PIP_PROGRESS_BAR: "off"
TORCH_INDEX_URL: https://download.pytorch.org/whl/cpu
WEBUI_LAUNCH_LIVE_OUTPUT: "1"
PYTHONUNBUFFERED: "1"
- name: Start test server
run: >
python -m coverage run
--data-file=.coverage.server
launch.py
--skip-prepare-environment
--skip-torch-cuda-test
--test-server
--no-half
--disable-opt-split-attention
--use-cpu all
--add-stop-route
2>&1 | tee output.txt &
- name: Run tests
run: |
wait-for-it --service 127.0.0.1:7860 -t 600
python -m pytest -vv --junitxml=test/results.xml --cov . --cov-report=xml --verify-base-url test
- name: Kill test server
if: always()
run: curl -vv -XPOST http://127.0.0.1:7860/_stop && sleep 10
- name: Show coverage
run: |
python -m coverage combine .coverage*
python -m coverage report -i
python -m coverage html -i
- name: Upload main app output
uses: actions/upload-artifact@v3
if: always()
with:
name: output
path: output.txt
- name: Upload coverage HTML
uses: actions/upload-artifact@v3
if: always()
with:
name: htmlcov
path: htmlcov
__pycache__
*.ckpt
*.safetensors
*.pth
/ESRGAN/*
/SwinIR/*
/repositories
/venv
/tmp
/model.ckpt
/models/**/*
/GFPGANv1.3.pth
/gfpgan/weights/*.pth
/ui-config.json
/outputs
/config.json
/log
/webui.settings.bat
/embeddings
/styles.csv
/params.txt
/styles.csv.bak
/webui-user.bat
/webui-user.sh
/interrogate
/user.css
/.idea
notification.mp3
/SwinIR
/textual_inversion
.vscode
/extensions
/test/stdout.txt
/test/stderr.txt
/cache.json*
/config_states/
/node_modules
/package-lock.json
/.coverage*
# See https://pylint.pycqa.org/en/latest/user_guide/messages/message_control.html
[MESSAGES CONTROL]
disable=C,R,W,E,I
* @AUTOMATIC1111
# if you were managing a localization and were removed from this file, this is because
# the intended way to do localizations now is via extensions. See:
# https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Developing-extensions
# Make a repo with your localization and since you are still listed as a collaborator
# you can add it to the wiki page yourself. This change is because some people complained
# the git commit log is cluttered with things unrelated to almost everyone and
# because I believe this is the best overall for the project to handle localizations almost
# entirely without my oversight.
model:
base_learning_rate: 1.0e-04
target: ldm.models.diffusion.ddpm.LatentDiffusion
params:
linear_start: 0.00085
linear_end: 0.0120
num_timesteps_cond: 1
log_every_t: 200
timesteps: 1000
first_stage_key: "jpg"
cond_stage_key: "txt"
image_size: 64
channels: 4
cond_stage_trainable: false # Note: different from the one we trained before
conditioning_key: crossattn
monitor: val/loss_simple_ema
scale_factor: 0.18215
use_ema: False
scheduler_config: # 10000 warmup steps
target: ldm.lr_scheduler.LambdaLinearScheduler
params:
warm_up_steps: [ 10000 ]
cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
f_start: [ 1.e-6 ]
f_max: [ 1. ]
f_min: [ 1. ]
unet_config:
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
params:
image_size: 32 # unused
in_channels: 4
out_channels: 4
model_channels: 320
attention_resolutions: [ 4, 2, 1 ]
num_res_blocks: 2
channel_mult: [ 1, 2, 4, 4 ]
num_heads: 8
use_spatial_transformer: True
transformer_depth: 1
context_dim: 768
use_checkpoint: True
legacy: False
first_stage_config:
target: ldm.models.autoencoder.AutoencoderKL
params:
embed_dim: 4
monitor: val/rec_loss
ddconfig:
double_z: true
z_channels: 4
resolution: 256
in_channels: 3
out_ch: 3
ch: 128
ch_mult:
- 1
- 2
- 4
- 4
num_res_blocks: 2
attn_resolutions: []
dropout: 0.0
lossconfig:
target: torch.nn.Identity
cond_stage_config:
target: modules.xlmr.BertSeriesModelWithTransformation
params:
name: "XLMR-Large"
# File modified by authors of InstructPix2Pix from original (https://github.com/CompVis/stable-diffusion).
# See more details in LICENSE.
model:
base_learning_rate: 1.0e-04
target: modules.models.diffusion.ddpm_edit.LatentDiffusion
params:
linear_start: 0.00085
linear_end: 0.0120
num_timesteps_cond: 1
log_every_t: 200
timesteps: 1000
first_stage_key: edited
cond_stage_key: edit
# image_size: 64
# image_size: 32
image_size: 16
channels: 4
cond_stage_trainable: false # Note: different from the one we trained before
conditioning_key: hybrid
monitor: val/loss_simple_ema
scale_factor: 0.18215
use_ema: false
scheduler_config: # 10000 warmup steps
target: ldm.lr_scheduler.LambdaLinearScheduler
params:
warm_up_steps: [ 0 ]
cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
f_start: [ 1.e-6 ]
f_max: [ 1. ]
f_min: [ 1. ]
unet_config:
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
params:
image_size: 32 # unused
in_channels: 8
out_channels: 4
model_channels: 320
attention_resolutions: [ 4, 2, 1 ]
num_res_blocks: 2
channel_mult: [ 1, 2, 4, 4 ]
num_heads: 8
use_spatial_transformer: True
transformer_depth: 1
context_dim: 768
use_checkpoint: True
legacy: False
first_stage_config:
target: ldm.models.autoencoder.AutoencoderKL
params:
embed_dim: 4
monitor: val/rec_loss
ddconfig:
double_z: true
z_channels: 4
resolution: 256
in_channels: 3
out_ch: 3
ch: 128
ch_mult:
- 1
- 2
- 4
- 4
num_res_blocks: 2
attn_resolutions: []
dropout: 0.0
lossconfig:
target: torch.nn.Identity
cond_stage_config:
target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
data:
target: main.DataModuleFromConfig
params:
batch_size: 128
num_workers: 1
wrap: false
validation:
target: edit_dataset.EditDataset
params:
path: data/clip-filtered-dataset
cache_dir: data/
cache_name: data_10k
split: val
min_text_sim: 0.2
min_image_sim: 0.75
min_direction_sim: 0.2
max_samples_per_prompt: 1
min_resize_res: 512
max_resize_res: 512
crop_res: 512
output_as_edit: False
real_input: True
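With `conditioning_key: hybrid` and `in_channels: 8`, the UNet receives the noisy latent (4 channels) concatenated channel-wise with the VAE-encoded conditioning image (4 channels). A hedged sketch of that assembly; the tensor names and 64x64 latent size are illustrative, not taken from this repo:

```python
import torch

# Illustrative only: hybrid conditioning concatenates the noisy latent
# with the encoded input image along the channel axis (4 + 4 = 8,
# matching unet_config.params.in_channels above).
noisy_latent = torch.randn(1, 4, 64, 64)  # z_t for the "edited" image
image_cond = torch.randn(1, 4, 64, 64)    # VAE latent of the "edit" input image
unet_input = torch.cat([noisy_latent, image_cond], dim=1)
assert unet_input.shape[1] == 8
```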
model:
base_learning_rate: 1.0e-04
target: ldm.models.diffusion.ddpm.LatentDiffusion
params:
linear_start: 0.00085
linear_end: 0.0120
num_timesteps_cond: 1
log_every_t: 200
timesteps: 1000
first_stage_key: "jpg"
cond_stage_key: "txt"
image_size: 64
channels: 4
cond_stage_trainable: false # Note: different from the one we trained before
conditioning_key: crossattn
monitor: val/loss_simple_ema
scale_factor: 0.18215
use_ema: False
scheduler_config: # 10000 warmup steps
target: ldm.lr_scheduler.LambdaLinearScheduler
params:
warm_up_steps: [ 10000 ]
cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
f_start: [ 1.e-6 ]
f_max: [ 1. ]
f_min: [ 1. ]
unet_config:
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
params:
image_size: 32 # unused
in_channels: 4
out_channels: 4
model_channels: 320
attention_resolutions: [ 4, 2, 1 ]
num_res_blocks: 2
channel_mult: [ 1, 2, 4, 4 ]
num_heads: 8
use_spatial_transformer: True
transformer_depth: 1
context_dim: 768
use_checkpoint: True
legacy: False
first_stage_config:
target: ldm.models.autoencoder.AutoencoderKL
params:
embed_dim: 4
monitor: val/rec_loss
ddconfig:
double_z: true
z_channels: 4
resolution: 256
in_channels: 3
out_ch: 3
ch: 128
ch_mult:
- 1
- 2
- 4
- 4
num_res_blocks: 2
attn_resolutions: []
dropout: 0.0
lossconfig:
target: torch.nn.Identity
cond_stage_config:
target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
model:
base_learning_rate: 7.5e-05
target: ldm.models.diffusion.ddpm.LatentInpaintDiffusion
params:
linear_start: 0.00085
linear_end: 0.0120
num_timesteps_cond: 1
log_every_t: 200
timesteps: 1000
first_stage_key: "jpg"
cond_stage_key: "txt"
image_size: 64
channels: 4
cond_stage_trainable: false # Note: different from the one we trained before
conditioning_key: hybrid # important
monitor: val/loss_simple_ema
scale_factor: 0.18215
finetune_keys: null
scheduler_config: # 10000 warmup steps
target: ldm.lr_scheduler.LambdaLinearScheduler
params:
        warm_up_steps: [ 2500 ] # NOTE: for resuming; use 10000 if starting from scratch
cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
f_start: [ 1.e-6 ]
f_max: [ 1. ]
f_min: [ 1. ]
unet_config:
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
params:
image_size: 32 # unused
in_channels: 9 # 4 data + 4 downscaled image + 1 mask
out_channels: 4
model_channels: 320
attention_resolutions: [ 4, 2, 1 ]
num_res_blocks: 2
channel_mult: [ 1, 2, 4, 4 ]
num_heads: 8
use_spatial_transformer: True
transformer_depth: 1
context_dim: 768
use_checkpoint: True
legacy: False
first_stage_config:
target: ldm.models.autoencoder.AutoencoderKL
params:
embed_dim: 4
monitor: val/rec_loss
ddconfig:
double_z: true
z_channels: 4
resolution: 256
in_channels: 3
out_ch: 3
ch: 128
ch_mult:
- 1
- 2
- 4
- 4
num_res_blocks: 2
attn_resolutions: []
dropout: 0.0
lossconfig:
target: torch.nn.Identity
cond_stage_config:
target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
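The `in_channels: 9` comment above spells out the input layout: 4 noisy-latent channels, 4 channels of the VAE-encoded masked image, and 1 mask channel downscaled to latent resolution. A hedged sketch of the concatenation; the channel ordering and shapes here are illustrative:

```python
import torch

# Illustrative assembly of the 9-channel inpainting input:
# noisy latent (4) + downscaled mask (1) + masked-image latent (4).
noisy_latent = torch.randn(1, 4, 64, 64)
mask = torch.rand(1, 1, 64, 64)            # 1 = repaint, 0 = keep; resized to latent size
masked_latent = torch.randn(1, 4, 64, 64)  # VAE encoding of image * (1 - mask)
unet_input = torch.cat([noisy_latent, mask, masked_latent], dim=1)
assert unet_input.shape[1] == 9
```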
name: automatic
channels:
- pytorch
- defaults
dependencies:
- python=3.10
- pip=23.0
- cudatoolkit=11.8
- pytorch=2.0
- torchvision=0.15
- numpy=1.23
import os
import gc
import time
import numpy as np
import torch
import torchvision
from PIL import Image
from einops import rearrange, repeat
from omegaconf import OmegaConf
import safetensors.torch
from ldm.models.diffusion.ddim import DDIMSampler
from ldm.util import instantiate_from_config, ismap
from modules import shared, sd_hijack
cached_ldsr_model: torch.nn.Module | None = None
# Create LDSR Class
class LDSR:
def load_model_from_config(self, half_attention):
global cached_ldsr_model
if shared.opts.ldsr_cached and cached_ldsr_model is not None:
print("Loading model from cache")
model: torch.nn.Module = cached_ldsr_model
else:
print(f"Loading model from {self.modelPath}")
_, extension = os.path.splitext(self.modelPath)
if extension.lower() == ".safetensors":
pl_sd = safetensors.torch.load_file(self.modelPath, device="cpu")
else:
pl_sd = torch.load(self.modelPath, map_location="cpu")
sd = pl_sd["state_dict"] if "state_dict" in pl_sd else pl_sd
config = OmegaConf.load(self.yamlPath)
config.model.target = "ldm.models.diffusion.ddpm.LatentDiffusionV1"
model: torch.nn.Module = instantiate_from_config(config.model)
model.load_state_dict(sd, strict=False)
model = model.to(shared.device)
if half_attention:
model = model.half()
if shared.cmd_opts.opt_channelslast:
model = model.to(memory_format=torch.channels_last)
sd_hijack.model_hijack.hijack(model) # apply optimization
model.eval()
if shared.opts.ldsr_cached:
cached_ldsr_model = model
return {"model": model}
def __init__(self, model_path, yaml_path):
self.modelPath = model_path
self.yamlPath = yaml_path
@staticmethod
def run(model, selected_path, custom_steps, eta):
example = get_cond(selected_path)
n_runs = 1
guider = None
ckwargs = None
ddim_use_x0_pred = False
temperature = 1.
custom_shape = None
height, width = example["image"].shape[1:3]
split_input = height >= 128 and width >= 128
if split_input:
ks = 128
stride = 64
            vqf = 4  # first-stage downsampling factor
model.split_input_params = {"ks": (ks, ks), "stride": (stride, stride),
"vqf": vqf,
"patch_distributed_vq": True,
"tie_braker": False,
"clip_max_weight": 0.5,
"clip_min_weight": 0.01,
"clip_max_tie_weight": 0.5,
"clip_min_tie_weight": 0.01}
else:
if hasattr(model, "split_input_params"):
delattr(model, "split_input_params")
x_t = None
logs = None
for _ in range(n_runs):
if custom_shape is not None:
x_t = torch.randn(1, custom_shape[1], custom_shape[2], custom_shape[3]).to(model.device)
x_t = repeat(x_t, '1 c h w -> b c h w', b=custom_shape[0])
logs = make_convolutional_sample(example, model,
custom_steps=custom_steps,
eta=eta, quantize_x0=False,
custom_shape=custom_shape,
temperature=temperature, noise_dropout=0.,
corrector=guider, corrector_kwargs=ckwargs, x_T=x_t,
ddim_use_x0_pred=ddim_use_x0_pred
)
return logs
def super_resolution(self, image, steps=100, target_scale=2, half_attention=False):
model = self.load_model_from_config(half_attention)
# Run settings
diffusion_steps = int(steps)
eta = 1.0
gc.collect()
        if torch.cuda.is_available():
torch.cuda.empty_cache()
im_og = image
width_og, height_og = im_og.size
        # If the max upscale size ever becomes adjustable, the hard-coded 4 below should become a variable
down_sample_rate = target_scale / 4
wd = width_og * down_sample_rate
hd = height_og * down_sample_rate
width_downsampled_pre = int(np.ceil(wd))
height_downsampled_pre = int(np.ceil(hd))
if down_sample_rate != 1:
print(
f'Downsampling from [{width_og}, {height_og}] to [{width_downsampled_pre}, {height_downsampled_pre}]')
im_og = im_og.resize((width_downsampled_pre, height_downsampled_pre), Image.LANCZOS)
else:
print(f"Down sample rate is 1 from {target_scale} / 4 (Not downsampling)")
# pad width and height to multiples of 64, pads with the edge values of image to avoid artifacts
pad_w, pad_h = np.max(((2, 2), np.ceil(np.array(im_og.size) / 64).astype(int)), axis=0) * 64 - im_og.size
im_padded = Image.fromarray(np.pad(np.array(im_og), ((0, pad_h), (0, pad_w), (0, 0)), mode='edge'))
logs = self.run(model["model"], im_padded, diffusion_steps, eta)
sample = logs["sample"]
sample = sample.detach().cpu()
sample = torch.clamp(sample, -1., 1.)
sample = (sample + 1.) / 2. * 255
sample = sample.numpy().astype(np.uint8)
sample = np.transpose(sample, (0, 2, 3, 1))
a = Image.fromarray(sample[0])
# remove padding
a = a.crop((0, 0) + tuple(np.array(im_og.size) * 4))
del model
gc.collect()
        if torch.cuda.is_available():
torch.cuda.empty_cache()
return a
def get_cond(selected_path):
example = {}
up_f = 4
c = selected_path.convert('RGB')
c = torch.unsqueeze(torchvision.transforms.ToTensor()(c), 0)
c_up = torchvision.transforms.functional.resize(c, size=[up_f * c.shape[2], up_f * c.shape[3]],
antialias=True)
c_up = rearrange(c_up, '1 c h w -> 1 h w c')
c = rearrange(c, '1 c h w -> 1 h w c')
c = 2. * c - 1.
c = c.to(shared.device)
example["LR_image"] = c
example["image"] = c_up
return example
@torch.no_grad()
def convsample_ddim(model, cond, steps, shape, eta=1.0, callback=None, normals_sequence=None,
mask=None, x0=None, quantize_x0=False, temperature=1., score_corrector=None,
corrector_kwargs=None, x_t=None
):
ddim = DDIMSampler(model)
bs = shape[0]
shape = shape[1:]
print(f"Sampling with eta = {eta}; steps: {steps}")
samples, intermediates = ddim.sample(steps, batch_size=bs, shape=shape, conditioning=cond, callback=callback,
normals_sequence=normals_sequence, quantize_x0=quantize_x0, eta=eta,
mask=mask, x0=x0, temperature=temperature, verbose=False,
score_corrector=score_corrector,
corrector_kwargs=corrector_kwargs, x_t=x_t)
return samples, intermediates
@torch.no_grad()
def make_convolutional_sample(batch, model, custom_steps=None, eta=1.0, quantize_x0=False, custom_shape=None, temperature=1., noise_dropout=0., corrector=None,
corrector_kwargs=None, x_T=None, ddim_use_x0_pred=False):
log = {}
z, c, x, xrec, xc = model.get_input(batch, model.first_stage_key,
return_first_stage_outputs=True,
force_c_encode=not (hasattr(model, 'split_input_params')
and model.cond_stage_key == 'coordinates_bbox'),
return_original_cond=True)
if custom_shape is not None:
z = torch.randn(custom_shape)
print(f"Generating {custom_shape[0]} samples of shape {custom_shape[1:]}")
z0 = None
log["input"] = x
log["reconstruction"] = xrec
if ismap(xc):
log["original_conditioning"] = model.to_rgb(xc)
if hasattr(model, 'cond_stage_key'):
log[model.cond_stage_key] = model.to_rgb(xc)
else:
log["original_conditioning"] = xc if xc is not None else torch.zeros_like(x)
if model.cond_stage_model:
log[model.cond_stage_key] = xc if xc is not None else torch.zeros_like(x)
if model.cond_stage_key == 'class_label':
log[model.cond_stage_key] = xc[model.cond_stage_key]
with model.ema_scope("Plotting"):
t0 = time.time()
sample, intermediates = convsample_ddim(model, c, steps=custom_steps, shape=z.shape,
eta=eta,
quantize_x0=quantize_x0, mask=None, x0=z0,
temperature=temperature, score_corrector=corrector, corrector_kwargs=corrector_kwargs,
x_t=x_T)
t1 = time.time()
if ddim_use_x0_pred:
sample = intermediates['pred_x0'][-1]
x_sample = model.decode_first_stage(sample)
try:
x_sample_noquant = model.decode_first_stage(sample, force_not_quantize=True)
log["sample_noquant"] = x_sample_noquant
log["sample_diff"] = torch.abs(x_sample_noquant - x_sample)
except Exception:
pass
log["sample"] = x_sample
log["time"] = t1 - t0
return log
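For reference, a minimal usage sketch of the `LDSR` class above; the paths are placeholders, and inside the webui this flow is driven by `UpscalerLDSR.do_upscale` below rather than called directly:

```python
from PIL import Image

# Hypothetical direct usage of LDSR; paths are placeholders.
ldsr = LDSR(model_path="models/LDSR/model.ckpt", yaml_path="models/LDSR/project.yaml")
img = Image.open("input.png").convert("RGB")
result = ldsr.super_resolution(img, steps=100, target_scale=2, half_attention=False)
result.save("output.png")
```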
import os
from modules import paths
def preload(parser):
parser.add_argument("--ldsr-models-path", type=str, help="Path to directory with LDSR model file(s).", default=os.path.join(paths.models_path, 'LDSR'))
import os
from basicsr.utils.download_util import load_file_from_url
from modules.upscaler import Upscaler, UpscalerData
from ldsr_model_arch import LDSR
from modules import shared, script_callbacks, errors
import sd_hijack_autoencoder # noqa: F401
import sd_hijack_ddpm_v1 # noqa: F401
class UpscalerLDSR(Upscaler):
def __init__(self, user_path):
self.name = "LDSR"
self.user_path = user_path
self.model_url = "https://heibox.uni-heidelberg.de/f/578df07c8fc04ffbadf3/?dl=1"
self.yaml_url = "https://heibox.uni-heidelberg.de/f/31a76b13ea27482981b4/?dl=1"
super().__init__()
scaler_data = UpscalerData("LDSR", None, self)
self.scalers = [scaler_data]
def load_model(self, path: str):
# Remove incorrect project.yaml file if too big
yaml_path = os.path.join(self.model_path, "project.yaml")
old_model_path = os.path.join(self.model_path, "model.pth")
new_model_path = os.path.join(self.model_path, "model.ckpt")
local_model_paths = self.find_models(ext_filter=[".ckpt", ".safetensors"])
local_ckpt_path = next(iter([local_model for local_model in local_model_paths if local_model.endswith("model.ckpt")]), None)
local_safetensors_path = next(iter([local_model for local_model in local_model_paths if local_model.endswith("model.safetensors")]), None)
local_yaml_path = next(iter([local_model for local_model in local_model_paths if local_model.endswith("project.yaml")]), None)
if os.path.exists(yaml_path):
statinfo = os.stat(yaml_path)
            if statinfo.st_size >= 10485760:  # 10 MiB
print("Removing invalid LDSR YAML file.")
os.remove(yaml_path)
if os.path.exists(old_model_path):
print("Renaming model from model.pth to model.ckpt")
os.rename(old_model_path, new_model_path)
if local_safetensors_path is not None and os.path.exists(local_safetensors_path):
model = local_safetensors_path
else:
model = local_ckpt_path if local_ckpt_path is not None else load_file_from_url(url=self.model_url, model_dir=self.model_download_path, file_name="model.ckpt", progress=True)
yaml = local_yaml_path if local_yaml_path is not None else load_file_from_url(url=self.yaml_url, model_dir=self.model_download_path, file_name="project.yaml", progress=True)
try:
return LDSR(model, yaml)
except Exception:
errors.report("Error importing LDSR", exc_info=True)
return None
def do_upscale(self, img, path):
ldsr = self.load_model(path)
if ldsr is None:
print("NO LDSR!")
return img
ddim_steps = shared.opts.ldsr_steps
return ldsr.super_resolution(img, ddim_steps, self.scale)
def on_ui_settings():
import gradio as gr
shared.opts.add_option("ldsr_steps", shared.OptionInfo(100, "LDSR processing steps. Lower = faster", gr.Slider, {"minimum": 1, "maximum": 200, "step": 1}, section=('upscaling', "Upscaling")))
shared.opts.add_option("ldsr_cached", shared.OptionInfo(False, "Cache LDSR model in memory", gr.Checkbox, {"interactive": True}, section=('upscaling', "Upscaling")))
script_callbacks.on_ui_settings(on_ui_settings)
from modules import extra_networks, shared
import lora
class ExtraNetworkLora(extra_networks.ExtraNetwork):
def __init__(self):
super().__init__('lora')
def activate(self, p, params_list):
additional = shared.opts.sd_lora
if additional != "None" and additional in lora.available_loras and not any(x for x in params_list if x.items[0] == additional):
p.all_prompts = [x + f"<lora:{additional}:{shared.opts.extra_networks_default_multiplier}>" for x in p.all_prompts]
params_list.append(extra_networks.ExtraNetworkParams(items=[additional, shared.opts.extra_networks_default_multiplier]))
names = []
multipliers = []
for params in params_list:
assert params.items
names.append(params.items[0])
multipliers.append(float(params.items[1]) if len(params.items) > 1 else 1.0)
lora.load_loras(names, multipliers)
if shared.opts.lora_add_hashes_to_infotext:
lora_hashes = []
for item in lora.loaded_loras:
shorthash = item.lora_on_disk.shorthash
if not shorthash:
continue
alias = item.mentioned_name
if not alias:
continue
alias = alias.replace(":", "").replace(",", "")
lora_hashes.append(f"{alias}: {shorthash}")
if lora_hashes:
p.extra_generation_params["Lora hashes"] = ", ".join(lora_hashes)
def deactivate(self, p):
pass
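Each entry in `params_list` comes from a `<lora:name:multiplier>` tag in the prompt, with `items` holding the positional arguments and the multiplier defaulting to 1.0 when omitted. A self-contained sketch of the parsing loop in `activate`, with `ExtraNetworkParams` stood in by plain lists and hypothetical tag values:

```python
# Self-contained illustration of the names/multipliers loop in activate().
params_list = [["detail_tweaker", "0.8"], ["style_lora"]]  # hypothetical <lora:...> tags

names, multipliers = [], []
for items in params_list:
    assert items
    names.append(items[0])
    multipliers.append(float(items[1]) if len(items) > 1 else 1.0)

print(names)        # ['detail_tweaker', 'style_lora']
print(multipliers)  # [0.8, 1.0]
```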
import os
from modules import paths
def preload(parser):
parser.add_argument("--lora-dir", type=str, help="Path to directory with Lora networks.", default=os.path.join(paths.models_path, 'Lora'))
import os
from modules import paths
def preload(parser):
parser.add_argument("--scunet-models-path", type=str, help="Path to directory with ScuNET model file(s).", default=os.path.join(paths.models_path, 'ScuNET'))
import os
from modules import paths
def preload(parser):
parser.add_argument("--swinir-models-path", type=str, help="Path to directory with SwinIR model file(s).", default=os.path.join(paths.models_path, 'SwinIR'))
import gradio as gr
from modules import shared
shared.options_templates.update(shared.options_section(('canvas_hotkey', "Canvas Hotkeys"), {
    "canvas_hotkey_zoom": shared.OptionInfo("Alt", "Zoom canvas", gr.Radio, {"choices": ["Shift", "Ctrl", "Alt"]}).info("If you choose 'Shift' you cannot scroll horizontally; 'Alt' can cause a little trouble in Firefox"),
    "canvas_hotkey_adjust": shared.OptionInfo("Ctrl", "Adjust brush size", gr.Radio, {"choices": ["Shift", "Ctrl", "Alt"]}).info("If you choose 'Shift' you cannot scroll horizontally; 'Alt' can cause a little trouble in Firefox"),
    "canvas_hotkey_move": shared.OptionInfo("F", "Moving the canvas").info("To work correctly in Firefox, turn off 'Automatically search the page text when typing' in the browser settings"),
    "canvas_hotkey_fullscreen": shared.OptionInfo("S", "Fullscreen mode: maximizes the picture so that it fits the screen and stretches it to full width"),
    "canvas_hotkey_reset": shared.OptionInfo("R", "Reset zoom and canvas position"),
    "canvas_hotkey_overlap": shared.OptionInfo("O", "Toggle overlap").info("Technical button, needed for testing"),
    "canvas_show_tooltip": shared.OptionInfo(True, "Enable tooltip on the canvas"),
    "canvas_disabled_functions": shared.OptionInfo(["Overlap"], "Disable functions that you don't use", gr.CheckboxGroup, {"choices": ["Zoom", "Adjust brush size", "Moving canvas", "Fullscreen", "Reset Zoom", "Overlap"]}),
}))
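Options registered this way are read back as attributes of `shared.opts`, the same pattern the LDSR script uses with `shared.opts.ldsr_steps` above. A brief sketch:

```python
# Hedged sketch: each key passed to the options section becomes an
# attribute on shared.opts once the settings are loaded.
from modules import shared

if shared.opts.canvas_show_tooltip:
    pass  # render the canvas tooltip
```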
stable-diffusion-webui/html/card-no-preview.png (binary image, 82.5 KB)
<div class='nocards'>
<h1>Nothing here. Add some content to the following directories:</h1>
<ul>
{dirs}
</ul>
</div>