卢子期

不倒失衡

Mon, 15 Dec 2025 00:00:00 +0000

不倒失衡

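In 不倒失衡, I focused on creating procedural stylized textures in Substance Designer, including metal mesh, concrete, wood, and rust on coated metal, while optimizing memory use through texture compression. I also built stylized effects such as smoke and firelight in Niagara, and created an ocean surface material to enrich the visuals of the game world.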

Project Titan

Mon, 15 Dec 2025 00:00:00 +0000

Project Titan

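This is a tutorial-based technical study and validation project that examines Houdini's procedural workflow by recreating a 3D environment in Unreal Engine 5. The goal was to understand, evaluate, and test, under near-production conditions, the procedural asset pipeline and workflows presented in the tutorial series, along with how they integrate with modern UE5 technology. The project covers recreating and optimizing a range of digital asset tools, including procedural signage, building generation, a railway system, train destruction effects, VAT characters, tree wind effects, and tools for fences, platforms, shrubs, ivy, trains, cloth, stacking, and cables. Through these exercises I not only reproduced the functionality shown in the tutorials but also came to understand procedural generation, material and effects authoring, simulation optimization, and the engine integration process, building experience for creating complex game scenes efficiently later on.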

错节复奏

Thu, 13 Nov 2025 00:00:00 +0000

错节复奏

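As the developer, I handled everything except game design and level design, including programming, asset creation, and sound effects.

Notably, after the designer asked for "bullet time combined with a physics simulation of shattering glass", my prototype ran into severe performance problems. I dug into Unity's call stack and found two bottlenecks: the creation and destruction of objects in the physics simulation module, and the fact that the shards took part in the simulation while the room terrain was moving, which was unnecessary. I therefore pre-created the glass shards at game startup and temporarily froze their "permission" to take part in the physics simulation whenever the terrain changed, which resolved the performance problem.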

福尔摩斯：完美犯罪

Mon, 03 Nov 2025 00:00:00 +0000

福尔摩斯：完美犯罪

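In this large project, I took on more specialized work, including scene lighting, hero props, and materials. Specifically, this covered content closely tied to the game design, such as the "chemistry lab components" and "poison traces".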

活字生形

Thu, 02 Oct 2025 00:00:00 +0000

活字生形

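My main contribution was the AI optical character recognition (OCR) plugin system. I integrated the state-of-the-art open-source model PPOCR-v5 into Unreal Engine to support the core gameplay.

Beyond that, as a technical artist, I also helped create the comic-style stylized shaders, the visual effects, the environment art design, and all of the lighting.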

TerraCraft: City-scale generative procedural modeling with natural languages

Fri, 27 Jun 2025 00:00:00 +0000

This work was motivated by my results.

WaterLOD

Mon, 09 Jun 2025 00:00:00 +0000

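WaterLOD Abstract

With advances in computer-aided simulation research, demand for large-scale 3D fluid simulation is growing in geotechnical and hydraulic engineering. These fields need to handle large-scale, high-precision fluid simulation data. However, traditional visualization techniques suffer from low efficiency, poor realism, and a lack of intuitiveness. Meanwhile, the broad application prospects of currently popular technologies such as virtual reality and augmented reality call for photorealistic in-scene fluid visualization.

This study therefore focuses on photorealistic in-scene visualization of large-scale fluid particle data, aiming to visualize data sets on the order of hundreds of millions of particles quickly and interactively. Based on a close look at the characteristics of the input data, it first introduces continuous level-of-detail techniques to fluid particle data to cut wasted performance. In addition, by monitoring hardware performance, it intelligently preloads upcoming fluid animation frames to improve the smoothness of video rendering. The core of the work is the vtkOpenGLFluidMapper class, developed within VTK's OpenGL extension. By applying continuous level of detail together with future-frame preloading and transfer optimization, the study establishes an efficient method for in-scene visualization of fluid particle data and implements a prototype system to verify the feasibility and advantages of the algorithm. The approach improves the efficiency of realistic visualization without noticeably affecting visual quality.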

FlameGS

Sun, 08 Sep 2024 00:00:00 +0000

Reconstructing Detailed Facial Mesh from Monocular or Multicam Videos

July 2024 - Sept. 2024

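Internship supervised by Professor Yin Yang at the University of Utah.

Developed a comprehensive inference pipeline that reconstructs photorealistic facial meshes, textures, and animation from monocular or multi-camera video, improving the realism and usefulness of virtual characters across applications.
Combined the differentiable parametric FLAME face model with Gaussian splatting to efficiently capture facial features under extreme data distributions.
Devised a transfer algorithm for the hybrid mesh-and-Gaussian avatar representation to support further simulation of facial motion. The result is fully controllable in expression, pose, and viewpoint, and is user-friendly.
Achieved image similarity and multi-view consistency comparable to state-of-the-art methods while producing temporally smoother pose and expression animation.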

Copyright

Header image copyright: by Technical University of Munich

Procedural Texturing | Real-time Rendering Chapter 6.3

Sun, 01 Sep 2024 00:00:00 +0200

Procedural Texturing

Although procedural textures are commonly used in offline rendering applications, image textures are far more common in real-time rendering. This is due to the extremely high efficiency of the image texturing hardware in modern GPUs, which can perform many billions of texture accesses in a second. However, GPU architectures are evolving toward less expensive computation and (relatively) more costly memory access. These trends have made procedural textures find greater use in real-time applications.

Volume textures are a particularly attractive application for procedural texturing, given the high storage costs of volume image textures. One of the most common is using one or more noise functions to generate values. A noise function is often sampled at successive powers-of-two frequencies, called octaves. Each octave is given a weight, usually falling as the frequency increases, and the sum of these weighted samples is called a turbulence function.
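The octave sum described above can be sketched in a few lines of C++. This is a minimal illustration, not a reference implementation: the 1D domain, the `noise` callback, and the default `gain` of 0.5 are assumptions made for the example.

```cpp
#include <functional>

// Sums `octaves` samples of a noise function at successive power-of-two
// frequencies. Each octave's weight is the previous weight times `gain`,
// so the weights fall as the frequency rises.
// `noise` is any caller-supplied 1D noise function (hypothetical here).
double turbulence(const std::function<double(double)>& noise,
                  double x, int octaves, double gain = 0.5) {
    double sum = 0.0;
    double frequency = 1.0;
    double weight = 1.0;
    for (int i = 0; i < octaves; ++i) {
        sum += weight * noise(x * frequency);  // weighted sample of this octave
        frequency *= 2.0;                      // next octave doubles the frequency
        weight *= gain;                        // and gets a smaller weight
    }
    return sum;
}
```

Passing the noise function as a parameter keeps the sketch independent of any particular noise implementation, such as Perlin or value noise.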

Other procedural methods are possible. For example, a cellular texture is formed by measuring distances from each location to a set of “feature points” scattered through space. Mapping the resulting closest distances in various ways, for example to color or shading normal, creates patterns that resemble cells, flagstones, or lizard skin.
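As a rough sketch of the distance measurement behind a cellular texture, the function below returns the distance from a 2D location to its nearest feature point. The `Point2` type, the brute-force search, and the 2D setting are assumptions for illustration; practical implementations usually bucket feature points in a grid.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

struct Point2 { double x, y; };

// Distance from `p` to the nearest feature point. A cellular texture maps
// this value (and often the second-closest distance) to color or shading.
double closestFeatureDistance(const Point2& p, const std::vector<Point2>& features) {
    double best = std::numeric_limits<double>::infinity();
    for (const Point2& f : features) {
        best = std::min(best, std::hypot(p.x - f.x, p.y - f.y));
    }
    return best;  // infinity when there are no feature points
}
```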

When generating a procedural two-dimensional texture, parameterization issues (UV) can pose even more difficulties than for authored textures, where stretching or seam artifacts can be manually touched up or worked around.

Antialiasing procedural textures is both harder and easier than antialiasing image textures. On one hand, precomputation methods such as mipmapping are not available, putting the burden on the programmer. On the other, the procedural texture author has “inside information” about the texture content and so can tailor it to avoid aliasing. This is particularly true for procedural textures created by summing multiple noise functions. The frequency of each noise function is known, so any frequencies that would cause aliasing can be discarded, actually making the computation less costly.
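One hedged way to picture the "discard aliasing frequencies" idea: given the spacing between adjacent samples (for example, the texture-space footprint of a pixel), count how many octaves stay at or below the Nyquist limit and sum only those. The function name and parameters are hypothetical.

```cpp
// Returns how many octaves, starting at `baseFrequency` and doubling each
// time, stay at or below the Nyquist limit 1 / (2 * sampleSpacing).
// Octaves past that limit would alias, so a noise sum can skip them.
int octavesBelowNyquist(double baseFrequency, double sampleSpacing, int maxOctaves) {
    const double nyquist = 1.0 / (2.0 * sampleSpacing);
    int count = 0;
    double frequency = baseFrequency;
    while (count < maxOctaves && frequency <= nyquist) {
        ++count;
        frequency *= 2.0;
    }
    return count;
}
```

Because distant surfaces have a larger sample spacing, they evaluate fewer octaves, which is where the cost saving comes from.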

Premultiplied Alphas and Compositing | Real-time Rendering Chapter 5.5.3

Sat, 17 Feb 2024 00:00:00 +0200

Premultiplied Alphas and Compositing

The over operator is also used for blending together photographs or synthetic renderings of objects. This process is called compositing. The image formed by the alpha channel is sometimes called the matte. It shows the silhouette shape of the object.

One way to use synthetic RGBα data is with premultiplied alphas (also known as associated alphas). That is, the RGB values are multiplied by the alpha value before being used. This makes the compositing over equation more efficient:

$$\mathbf{c}_o=\mathbf{c}_s'+(1-\alpha_s)\mathbf{c}_d$$

where $\mathbf{c}_s'$ is the premultiplied source channel. Premultiplied alpha also makes it possible to use over and additive blending without changing the blend state, since the source color is now added in during blending.
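The premultiplied over equation can be sketched directly in C++. The `RGBA` struct and function names are illustrative assumptions; the destination is treated as premultiplied as well.

```cpp
struct RGBA { double r, g, b, a; };

// Converts an unmultiplied (unassociated) color to premultiplied form.
RGBA premultiply(const RGBA& c) {
    return {c.r * c.a, c.g * c.a, c.b * c.a, c.a};
}

// "Over" with a premultiplied source: c_o = c_s' + (1 - alpha_s) * c_d.
RGBA overPremultiplied(const RGBA& src, const RGBA& dst) {
    const double k = 1.0 - src.a;
    return {src.r + k * dst.r, src.g + k * dst.g,
            src.b + k * dst.b, src.a + k * dst.a};
}
```

A premultiplied source with alpha 0 but nonzero color is simply added to the destination, which is how the same blend state covers both over and additive blending.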

Rendering synthetic images dovetails naturally with premultiplied alphas. An antialiased opaque object rendered over a black background provides premultiplied values by default.

Another way images are stored is with unmultiplied alphas, also known as unassociated alphas or even as the mind-bending term nonpremultiplied alphas. It is best to use premultiplied data whenever filtering and blending is performed, as operations such as linear interpolation do not work correctly using unmultiplied alphas. Artifacts such as black fringes around the edges of objects can result.
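A small sketch shows where the black fringe comes from. It interpolates halfway between an opaque white texel and a fully transparent texel that happens to store black, once on unmultiplied data and once on premultiplied data. All names are hypothetical.

```cpp
struct RGBA { double r, g, b, a; };

RGBA premultiply(const RGBA& c) {
    return {c.r * c.a, c.g * c.a, c.b * c.a, c.a};
}

// Component-wise linear interpolation, as a bilinear filter would do.
RGBA mix(const RGBA& p, const RGBA& q, double t) {
    return {p.r + t * (q.r - p.r), p.g + t * (q.g - p.g),
            p.b + t * (q.b - p.b), p.a + t * (q.a - p.a)};
}

// Filter the unmultiplied texels first, then premultiply for blending.
RGBA filterUnmultiplied(const RGBA& p, const RGBA& q, double t) {
    return premultiply(mix(p, q, t));
}

// Premultiply each texel first, then filter.
RGBA filterPremultiplied(const RGBA& p, const RGBA& q, double t) {
    return mix(premultiply(p), premultiply(q), t);
}
```

With these inputs the unmultiplied path contributes 0.25 per channel at alpha 0.5 (a mid-gray edge), while the premultiplied path contributes 0.5 (white at half coverage). The hidden black of the transparent texel has leaked into the first result.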

For image-manipulation applications, an unassociated alpha is useful to mask a photograph without affecting the underlying image’s original data. Also, an unassociated alpha means that the full precision range of the color channels can be used.

Care must be taken to properly convert unmultiplied RGBα values to and from the linear space used for computer graphics computations.

Image file formats that support alpha include PNG (unassociated alpha only), OpenEXR (associated only), and TIFF (both types of alpha).

3D Modeling and Sculpting

Fri, 12 Jan 2024 00:00:00 +0000

光之翼MV

Fri, 12 Jan 2024 00:00:00 +0000

时候舞蹈

Fri, 12 Jan 2024 00:00:00 +0000

MMD

MikuMikuDance is a 3D animation program made for fans of the Sony/Sega Vocaloid products. It was also my introduction to computer graphics and animation.

Hydrology Data Processing Script

Fri, 12 Jan 2024 00:00:00 +0000

The script was developed with the ArcGIS Python library for hydrology class research at Tsinghua University. All source code was open-sourced for next year's students.

Intro 2 CUDA

Tue, 12 Dec 2023 00:00:00 +0000

Intro 2 CUDA

Streams

Page-Locked Host Memory

cudaHostAlloc((void**)&a, N*sizeof(int), cudaHostAllocDefault);
cudaFreeHost(a);

Page-locked (pinned) host memory: the OS guarantees that the memory stays resident in physical memory and is never paged out to disk.

At the same time, pinned memory opts out of the benefits of virtual memory, so allocating too much of it can exhaust physical memory.

Multiple Streams

cudaStream_t stream1, stream2;
cudaStreamCreate(&stream1);
cudaStreamCreate(&stream2);
cudaMemcpyAsync(d_a, a, N*sizeof(int), cudaMemcpyHostToDevice, stream1);
cudaMemcpyAsync(d_b, b, N*sizeof(int), cudaMemcpyHostToDevice, stream2);
kernel<<<grid1, block1, 0, stream1>>>(d_a, N);
kernel<<<grid2, block2, 0, stream2>>>(d_b, N);
cudaStreamSynchronize(stream1);
cudaStreamSynchronize(stream2);
cudaStreamDestroy(stream1);
cudaStreamDestroy(stream2);

GPU Work Schedule

Be aware of how the GPU schedules work. Separate hardware engines execute different kinds of operations, such as memory copies and kernel execution. Operations are queued to these engines in the order they are issued in the code, so the dependencies between them follow that issue order.

Multi-GPU

Zero-Copy Host Memory

cudaHostAlloc((void**)&a, N*sizeof(int), cudaHostAllocWriteCombined | cudaHostAllocMapped);

cudaHostAllocWriteCombined: this flag tells the runtime to allocate the buffer as write-combined. It does not change the application's functionality, but it improves performance for buffers that will only be read by the GPU.

Write-combined memory can be extremely inefficient in scenarios where CPU also needs to perform reads from the buffer.

cudaHostAllocMapped: the buffers can be accessed from the GPU. However, since there is a difference between the virtual address space of the CPU and the GPU, the call to cudaHostAlloc() will return a CPU pointer, which is then mapped to a GPU pointer using cudaHostGetDevicePointer().

Portable Pinned Memory

This is necessary when you use multiple GPUs.

cudaHostAlloc((void**)&a, N*sizeof(int), cudaHostAllocPortable);

A buffer allocated as pinned appears page-locked only to the thread that allocated it. If another thread tries to access the buffer, it sees the buffer as standard pageable memory.

To support portable pinned memory and zero-copy memory in multi-GPU systems, the code needs two notable changes:

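void* function(void* arg) {
    // DataStruct is the per-thread argument type carrying deviceID (name assumed).
    DataStruct* data = (DataStruct*)arg;
    if (data->deviceID != 0) {
        cudaSetDevice(data->deviceID);          // bind this thread to its GPU
        cudaSetDeviceFlags(cudaDeviceMapHost);  // allow zero-copy mapping on it
    }
    // ... per-device work goes here ...
    return 0;
}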

We need a call to cudaSetDevice() so that each thread controls a different GPU.

In addition, as we use zero-copy in order to access these buffers directly from the GPU, we use cudaHostGetDevicePointer() to get the valid device pointers for the host memory.

float *a, *b, *partial_c;
float *dev_a, *dev_b, *dev_partial_c;

//allocate memory on the CPU side
a = data->a;
b = data->b;
partial_c = (float *)malloc(blocksPerGrid * sizeof(float));

cudaHostGetDevicePointer(&dev_a, a, 0);
cudaHostGetDevicePointer(&dev_b, b, 0);
cudaMalloc((void**)&dev_partial_c, blocksPerGrid * sizeof(float));

dev_a += data->offset;
dev_b += data->offset;

kernel<<<blocksPerGrid, threadsPerBlock>>>(dev_a, dev_b, dev_partial_c);

Experience

Tue, 24 Oct 2023 00:00:00 +0000

Contrast Adjustments

Thu, 27 Jul 2023 00:00:00 +0000

Tools for Adjusting Contrast

LIFT GAMMA GAIN

Lift: adjusts the toe (the shadows)
Gamma: changes the distribution of the midtones
Gain: adjusts the highlights
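To make the three controls concrete, here is a minimal sketch of one common lift/gamma/gain formulation on a normalized value in [0, 1]. The exact math differs between grading applications, so the formula and the function name are assumptions, not the behavior of any specific tool.

```cpp
#include <algorithm>
#include <cmath>

// One common formulation (assumed for illustration):
//   lifted = x + lift * (1 - x)    raises the toe while pinning white
//   gained = lifted * gain         scales toward the highlights
//   out    = gained ^ (1 / gamma)  redistributes the midtones
double liftGammaGain(double x, double lift, double gamma, double gain) {
    const double lifted = x + lift * (1.0 - x);
    const double gained = std::max(0.0, lifted * gain);  // avoid pow of a negative
    return std::pow(gained, 1.0 / gamma);
}
```

With neutral settings (lift 0, gamma 1, gain 1) the value passes through unchanged. A gamma above 1 brightens the midtones while leaving 0 and 1 fixed.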

Contrast & Pivot

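Contrast: linearly changes the image's black and white points
Pivot: sets how the contrast change is weighted between the black point and the white point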
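The relationship between the two controls can be sketched as a linear scale around the pivot. This simple form is an assumption for illustration; real tools may use an S-curve and clamp the result.

```cpp
// Scales the distance of each value from `pivot` by `contrast`.
// Values at the pivot stay put; values on either side move apart
// (contrast > 1) or together (contrast < 1). The result is not clamped.
double applyContrast(double x, double contrast, double pivot) {
    return (x - pivot) * contrast + pivot;
}
```

Lowering the pivot sends more of the change into the highlights, and raising it sends more into the shadows.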

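Luma Adjustments in Y’CbCr vs. RGB

Most grading systems implement their primary contrast controls with RGB image processing: when image lightness is adjusted, all three color components are adjusted equally and in step. The resulting adjustment has a visible effect on image saturation. Manipulating only the Y’ channel of Y’CbCr has no measurable effect on saturation (the vectorscope trace stays the same). The perceived saturation of the image, however, does change.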

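Fine-Tuning Contrast with the Log Grading Controls

Fine-Tuning Contrast with OFFSET, EXPOSURE, and CONTRAST

Use these controls before applying the LUT! This lets you control the image contrast and recover image detail that might otherwise be clipped when the LUT is applied.

Fine-Tuning Contrast with SHADOW, MIDTONE, and HIGHLIGHT

They are designed for adjustments made before the log footage is normalized.

How much these controls affect an image depends entirely on how much contrast the image itself has.
Shadow: affects the bottom third of the tonal range
Midtone: affects a broad range of midtones
Highlight: affects the top third of the tonal range (the highlights)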

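Setting Appropriate Highlights and Shadows

The perception of highlights is relative to the depth of the shadows. Sometimes lowering the shadows while leaving the highlights alone works better than raising the highlights.

Legalizing the White Level While Preserving the Midtones

Lower Gain while raising the midtones to compensate. You can also lower the shadows slightly to preserve contrast.

Try Lowering the Midtones Instead of Crushing the Shadows

Not every image needs its shadows pushed down to 0%. Sometimes lowering the midtones achieves the same effect.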

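Contrast and Visual Perception

Using the Surround Effect

Raise the highlights to make the shadows look darker, while trying to keep the average midtones from looking brighter than before.

Raise Gain.
To reduce the perceived brightness of the image, lower Gamma. This pulls the highlights down a little, but because the previous step raised them by a large amount, the impact is small.
Lowering the midtones also lowers the shadows a bit too much, so raise Lift to compensate.

Increasing Perceived Image Sharpness

Another effect of adjusting contrast: increasing contrast can make image detail look sharper.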

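Handling Footage with Exposure Problems

Underexposure

Trying to raise brightness and midtones usually:

increases noise
causes oversaturation or undersaturation
leaves the shadows lacking detail

Raising the Midtones Works Better Than Adjusting the White Level

Use the curve control to stretch the lower midtones while leaving the upper midtones unchanged, which keeps the shadows solid.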

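Overexposure

First make a call: do the overexposed areas need to be brought down at all?

If the overexposed area is not too large:

Adding Color to Overexposed Areas

Lower Gain to legalize the highlights.
If needed, raise the midtones with Gamma.
Use the Luma control of the HSL qualifier.
Push Gain toward a color, namely the complementary color.

Adding Glow to Overexposed Areas

If exaggerated overexposure cannot be avoided, glow and bloom can soften the edges of the transition region and make the image more pleasing.

Lower Gain to legalize the highlights, and raise Gamma to compensate.
Use the Luma control of the HSL qualifier to isolate the overexposed area.
Use Blur or Soften in the HSL qualifier to blur the key matte. The matte should end up larger than the overexposed area, with softer edges.
Use Lift or Gamma to brighten the keyed area. A glow-like bloom will gradually appear.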

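Rebuilding a Clipped Channel with the Channel Mixer

Significant overexposure can leave one channel unevenly distributed, producing ugly highlights. Check the RGB parade scope. If one channel sits well above 110% while the others do not, this method applies.

This is a last resort. The goal is to recover more image data, not to make the image prettier.

Adjust color and contrast to reach color balance (the primary correction). Do not worry about where clipping occurs yet; just make an overall correction. In this case, use Offset to reposition the three color channels: lower the red channel first, then use "Master Offset" to lower the overall signal and give the shadows more density.
Create a new node for the channel repair. This correction needs to sit on top of the previous one, so use a Layer Mixer.
Select the new node and check which channels hold more detail in the clipped region.
Use RGB Mixer / Channel Blending / Channel Isolation / Blend Mode to mix the green and blue channels into the red channel.
Use the Luma qualifier to isolate the overexposed highlights, then soften the edges and use that key matte to limit where the channel mix applies.
Finally, add further corrections to grade the image as a whole.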

Micro-PT

Tue, 28 Jun 2022 00:00:00 +0000

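Features

Distributed path tracing / stochastic progressive photon mapping
OpenMP multithreading
Next-gen microfacet PBR materials
Bounding volume hierarchy (BVH) acceleration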

嵌入式旋转音律

Thu, 02 Dec 2021 00:00:00 +0000