Format Modeling and API Semantic Analysis in GE

GE (Graph Engine) is a graph compiler and executor for Ascend. It uses computational graph optimization, multi-stream parallelism, memory reuse, and model sinking to speed up model execution and reduce model memory footprint. GE integrates with PyTorch and TensorFlow frontends and also parses and compiles mainstream model formats such as onnx and pb. Project address: https://gitcode.com/cann/ge

1. Why Format Becomes a Performance Problem

When users construct a computational graph for a deep learning model, they typically focus on the computational semantics: tensor dimensions, the mathematical meaning of operators, the dependencies between operators, and so on.

At this level, what the data physically looks like is usually taken for granted and gets no extra attention. Once the model enters actual execution, however, this assumption often no longer holds.

1.1 Gap Between User Semantics and Actual Execution

From the user's perspective, a tensor is just an ordered set of multidimensional data. From the execution perspective, that data has to be stored in a specific memory layout so that hardware can access it efficiently. In GE, Format denotes exactly this memory layout; other graph compilers usually call the same concept Layout. For example, NCHW and NHWC are two different formats in GE.

Different operators often prefer different formats, for example:

- For the Conv2D operator, the preferred image input format is NC1HWC0 and the preferred filter input format is FZ.
- For the MatMul operator, the preferred weight input format is NZ.

These differences do not come from the algorithms themselves; they originate in the implementation characteristics of the underlying hardware architecture.

1.2 Data Rearrangement Is Not Free

When operators have preferred formats, GE usually needs to insert additional data rearrangement operations that convert inputs into the format better suited to that operator's computation.
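To make the idea of a rearrangement concrete, here is a minimal sketch of what converting a dense NCHW buffer to NC1HWC0 involves. It is not GE's TransData implementation: the function name `NchwToNc1hwc0`, the use of `float`, and the zero padding of unused channel lanes are assumptions made for illustration. The memory order [N, C1, H, W, C0] follows the shape example used later in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a GE API): copy an NCHW buffer into NC1HWC0 order.
// c0 is the channel block size, 16 in this article's examples.
std::vector<float> NchwToNc1hwc0(const std::vector<float>& src,
                                 std::size_t n, std::size_t c,
                                 std::size_t h, std::size_t w,
                                 std::size_t c0) {
  assert(c0 > 0 && src.size() == n * c * h * w);
  const std::size_t c1 = (c + c0 - 1) / c0;  // number of channel blocks, ceil(c / c0)
  // Allocate the padded destination; lanes beyond the real channels stay zero.
  std::vector<float> dst(n * c1 * h * w * c0, 0.0f);
  for (std::size_t in = 0; in < n; ++in) {
    for (std::size_t ic = 0; ic < c; ++ic) {
      for (std::size_t ih = 0; ih < h; ++ih) {
        for (std::size_t iw = 0; iw < w; ++iw) {
          // Source index in NCHW order.
          const std::size_t src_idx = ((in * c + ic) * h + ih) * w + iw;
          // Destination index in [N, C1, H, W, C0] order.
          const std::size_t dst_idx =
              (((in * c1 + ic / c0) * h + ih) * w + iw) * c0 + ic % c0;
          dst[dst_idx] = src[src_idx];
        }
      }
    }
  }
  return dst;
}
```

Every element is read once and written once, and the destination can be larger than the source: for a 3-channel input with C0 = 16, 13 of every 16 channel lanes in the result are padding.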
However, this kind of data rearrangement is costly:

- It introduces additional computation overhead and memory bandwidth consumption.
- It may be triggered many times in a complex model.

More importantly, these rearrangements usually do not appear in the graph the user explicitly constructed, yet they directly affect the model's actual execution efficiency.

1.3 This Is a Systemic Problem

An intuitive idea is: since some operators are more sensitive to data layout, let each operator handle its own input and output formats.

In engineering practice, this is not the optimal solution. Take a multi-layer convolutional network as an example. If operators act independently, each Conv2D internally converts its input from NCHW to NC1HWC0 and, after computing, converts its output back to NCHW. The actual execution then turns into a chain in which each Conv2D is preceded by a TransData from NCHW to NC1HWC0 and followed by a TransData from NC1HWC0 back to NCHW.

For a single Conv2D, using its preferred format through data rearrangement yields better computational performance. Across the whole network, however, TransData appears again at every layer and significantly increases overall execution overhead.

Format-related problems are therefore essentially a network-wide optimization problem: while keeping the computational semantics correct, GE has to select an appropriate data layout for each operator and minimize unnecessary data conversions.

It is in this context that GE introduced a unified format modeling and optimization mechanism to systematically handle the gap between user semantics and actual execution.

2. Origin and Storage: Two Representation Systems in GE

To solve the data layout and performance problems described in the previous chapter, GE explicitly distinguishes two tensor representations that are linked but have different responsibilities: Origin and Storage.

These two representations describe, respectively, the user's original semantics and the form the data takes when operators actually execute. They are the basis of GE's format modeling and optimization.

2.1 Origin: Expressing and Propagating User Semantics

Origin describes the original semantics the user expressed when constructing the computational graph, including:

- OriginFormat: the tensor's format at the semantic level, e.g. NCHW
- OriginShape: the tensor's dimensions at the semantic level, e.g. [8, 3, 224, 224]

Origin usually comes from the frontend framework or from a model definition the user gave explicitly. Its core characteristics are:

- It directly reflects user intent.
- It contains no assumptions about specific hardware or implementations.
- It is not adjusted for performance goals.

When GE receives a computational graph, Origin is usually given explicitly by the graph inputs and by attributes of some key operators (for example, Conv2D marks its input and output formats via attributes). GE propagates Origin through the graph as far as possible. The purpose is not performance optimization but to retain a complete understanding of the user's original computational semantics throughout compilation.

This propagation gives subsequent optimization a clear semantic boundary: any format adjustment or execution optimization must not break Origin semantics.

2.2 Storage: Representing Actual Computation and Storage

Unlike Origin, Storage describes the representation a tensor takes in the actual execution phase, including:

- StorageFormat: the tensor's concrete layout in memory, e.g.
NC1HWC0, which splits the C axis into C1 and C0
- StorageShape: the tensor's actual shape in memory, e.g. an NCHW tensor with shape [8, 3, 224, 224] has shape [8, 1, 224, 224, 16] after conversion to NC1HWC0

Storage is not specified by the user. GE derives it during compilation from several factors, for example:

- Operator capabilities and limitations
- Each operator's affinity for particular formats
- Data flow relationships across the whole graph

Because not every operator supports every format, Storage derivation is inherently constrained. For example, some formats are valid only for specific operators or for specific inputs of an operator (such as weights).

Storage therefore represents an execution-oriented engineering choice. Its goal is, while satisfying operator capability constraints, to insert as few format conversions (TransData) as possible and so obtain better overall execution efficiency.

2.3 Summary: How Origin and Storage Divide the Work

In GE, the division of labor between Origin and Storage can be summarized as:

- Origin defines user semantics correctly and takes no part in performance trade-offs.
- Storage targets execution performance but must obey Origin semantics.

This division lets GE adjust the execution-layer format flexibly and optimize performance while guaranteeing that user semantics stay correct.

3. Basic Principles of Format Optimization

With the two representation systems clarified, the problems GE has to solve come down to two points:

- How to understand, as accurately as possible, the user's original format semantics across the whole computational graph
- On that basis, how to select a suitable execution format for each operator to obtain better overall execution efficiency

Around these two problems, GE's format optimization follows a clear path: first understand semantics, then optimize execution.

3.1 Whole-Graph Format Semantic Derivation Based on Origin

The first step of format optimization does not involve performance directly. It restores and understands the format semantics of the whole graph as far as possible.

GE takes the graph's input formats and format-sensitive operators (for example Conv2D, whose computation requires an explicitly specified input format) as anchors, propagates from them in both directions through the graph, and tries to derive the OriginFormat of every operator's inputs and outputs.

The goal is to widen, as far as possible, the region in which the user's original format semantics are understood, giving subsequent optimization a reliable semantic foundation.

3.2 Interruption and Uncertainty in Format Semantic Derivation

In real computational graphs, not every operator lets format semantics propagate continuously.

When GE meets an operator that changes the meaning of tensor dimensions (for example Reshape), the original format semantics often no longer hold. GE then treats the format semantics as interrupted at that point and marks the format on the Reshape's peer as unknown (usually ND).

This interruption is not a failure. It deliberately marks a semantic boundary, so that GE does not make wrong format deductions without sufficient information.

3.3 StorageFormat Selection and Propagation Based on Operator Capability

After OriginFormat derivation over the whole graph is complete, GE enters the execution-layer format selection phase.

At this point, the focus of format optimization shifts from whether the semantics are correct to how to obtain better execution efficiency.
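Before turning to how execution formats are chosen, the origin-side derivation from sections 3.1 and 3.2 can be pictured with a small sketch. It is heavily simplified and not GE's algorithm: the graph is reduced to a linear chain, and `Node`, `Anchor`, `BreaksFormatSemantics`, and `PropagateOriginFormat` are hypothetical names. As a further simplification, the Reshape node itself and every node that no anchor can reach are marked ND.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified node of a linear operator chain (not a GE type).
struct Node {
  std::string op;             // e.g. "Conv2D", "ReLU", "Reshape"
  std::string origin_format;  // empty string means "not derived yet"
};

// Build an anchor: a graph input or operator whose format is stated explicitly.
Node Anchor(const std::string& op, const std::string& format) {
  assert(!format.empty() && format != "ND");
  return Node{op, format};
}

// Operators that change dimension semantics interrupt propagation.
bool BreaksFormatSemantics(const Node& node) { return node.op == "Reshape"; }

// A format is usable for propagation only if it is decided and not ND.
bool IsKnown(const std::string& format) { return !format.empty() && format != "ND"; }

void PropagateOriginFormat(std::vector<Node>& chain) {
  if (chain.empty()) return;
  // Mark semantic boundaries first so nothing propagates through them.
  for (Node& node : chain)
    if (BreaksFormatSemantics(node)) node.origin_format = "ND";
  // Downstream pass: an undecided consumer inherits its producer's format.
  for (std::size_t i = 1; i < chain.size(); ++i)
    if (chain[i].origin_format.empty() && IsKnown(chain[i - 1].origin_format))
      chain[i].origin_format = chain[i - 1].origin_format;
  // Upstream pass: an undecided producer inherits its consumer's format.
  for (std::size_t i = chain.size() - 1; i > 0; --i)
    if (chain[i - 1].origin_format.empty() && IsKnown(chain[i].origin_format))
      chain[i - 1].origin_format = chain[i].origin_format;
  // Whatever no anchor could reach is recorded as unknown.
  for (Node& node : chain)
    if (node.origin_format.empty()) node.origin_format = "ND";
}
```

For a chain such as Data(NCHW), ReLU, Conv2D(NCHW), ReLU, Reshape, MatMul, this sketch assigns NCHW to both ReLU nodes and ND to Reshape and MatMul, which mirrors the idea of stopping at a semantic boundary rather than guessing.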
StorageFormat selection is not a direct mapping of OriginFormat. It has to weigh the following factors together:

- Which execution formats each operator supports
- Each operator's affinity for specific formats
- Overall execution efficiency across the whole graph

Throughout this phase, GE follows one premise: do not break Origin semantics that have already been established.

Because operators do not contribute equally to overall performance, GE gives priority to computationally expensive operators (for example convolution and matrix multiplication) and tries to give them the StorageFormat they have the most affinity for. Once these key operators have their execution formats, GE works outward from them, takes the capabilities and constraints of adjacent upstream and downstream operators into account, and propagates and coordinates StorageFormat so that no unnecessary format conversions land on critical paths.

Take the computational graph from section 1.3 as an example. After OriginFormat derivation, format optimization anchors on the computationally expensive Conv2D operator and selects its preferred StorageFormat, NC1HWC0. Because the following ReLU operator also supports NC1HWC0, the format propagates downstream along the computation path and stays consistent. The resulting execution layout keeps the Conv2D and ReLU operators in NC1HWC0, so TransData no longer has to appear at every layer.

3.4 Division of Labor Between Shape and Format Derivation

In GE, Shape derivation and Format derivation play different roles.

OriginShape derivation follows the InferShape process common in graph compilers: it starts from the graph's input Shape (the Shape the user understands) and derives forward layer by layer according to operator semantics until it reaches the graph outputs.

StorageShape, by contrast, is not an independently derived result. Once OriginShape, OriginFormat, and StorageFormat are all determined, StorageShape follows directly from the memory layout that StorageFormat defines.

This division decouples Shape semantic derivation from the execution-layer tensor format, so format optimization can proceed independently without interfering with semantic derivation.

4. Understanding Format/Shape Interfaces and Types from the GE External API Perspective

This chapter explains how Format and Shape are expressed at the interface and type level of GE's external API, and where a concept and the class of the same name can be understood differently.

4.1 Interface Layer: GetShape / GetOriginShape / GetStorageShape

The external interfaces usually provide three kinds of accessors for Shape and Format:

- GetOriginShape() / GetOriginFormat()
- GetStorageShape() / GetStorageFormat()
- GetShape() / GetFormat()

Where:

- GetOrigin*() explicitly returns Origin-perspective information, which expresses user semantics.
- GetStorage*() explicitly returns Storage-perspective information, which describes actual execution.
- Get*() (without an Origin or Storage prefix) does not specify a perspective, so it returns information that contains both the Origin and the Storage parts.

In other words, GetShape() does not return one particular kind of Shape; it returns an object that expresses Origin and Storage together. The Format interfaces work the same way.

The value of this interface design is:

- When semantic information is needed, the caller can explicitly use GetOrigin*().
- When execution information is needed, the caller can explicitly use GetStorage*().
- When the caller wants the complete description in one call, it can use Get*().

4.2 Type (class) Layer: Responsibility Boundaries of Shape / StorageShape / StorageFormat

4.2.1 Shape: A Pure Data Structure with No Bound Semantics

Shape is a pure data structure class that is responsible only for expressing a shape.
Therefore:

- Shape can carry an OriginShape.
- Shape can also carry a StorageShape.

Whether a given Shape belongs to Origin or to Storage depends on its usage context and on which interface returned it, not on any property of the Shape type itself.

4.2.2 StorageShape / StorageFormat: Descriptors That Carry Both Origin and Storage

Conceptually, StorageShape and StorageFormat are easily read as describing only execution-phase information. In the class system, however, objects of these two types take on a stronger responsibility: both are composite descriptors that carry the Origin part and the Storage part together.

Origin and Storage are bound in one type because Storage itself is complex. Storage may introduce rules such as dimension padding and alignment, so an execution-phase shape or format alone is not enough to describe accurately how it corresponds to user semantics.

Take the NC1HWC0 format as an example. Given a tensor with shape [8, 1, 224, 224, 16]:

- Its StorageFormat is NC1HWC0.
- Its OriginFormat could be NCHW or NHWC.
- The C dimension of its OriginShape could be any value from 1 to 16.

The execution-phase StorageShape or StorageFormat alone cannot uniquely recover the corresponding semantics. Only when the Origin and Storage parts are bound together do they form a complete description that is interpretable and can be used reliably.

Therefore, in the external API context:

- StorageShape and StorageFormat are closer to descriptors.
- They give explicit access to the different perspectives through GetOrigin*() / GetStorage*().
- The type itself is responsible for binding and encapsulation; it is not a direct mapping of a single concept.

4.3 Notes and Suggestions on Class Name Ambiguity

Some readers do confuse class StorageShape (the type name) with the StorageShape concept (the execution shape).
This confusion comes from an inherent weakness of the naming. From a modeling perspective, though, the class actually carries a complete description containing both Origin and Storage.

In practice, it is recommended to judge what you are getting by the interface that returns it, not by the class name. In the future, as long as existing interface compatibility is not broken, more explicit type names could also reduce the cost of understanding, for example:

    class ShapeDescriptor {...};
    using StorageShape = ShapeDescriptor;    // Deprecated: easily confused, no longer recommended
    class FormatDescriptor {...};
    using StorageFormat = FormatDescriptor;  // Deprecated: easily confused, no longer recommended

Appendix A: The Same Tensor Under the Origin and Storage Perspectives

The table below uses a concrete example to show how the information expressed for the same tensor differs between perspectives, and why a descriptor type is needed to carry all of it together.

| Perspective | Interface | Example Content | Description |
| --- | --- | --- | --- |
| Origin | GetOriginFormat() | NCHW | User semantic format |
| Origin | GetOriginShape() | [8, 3, 224, 224] | Shape as the user understands it |
| Storage | GetStorageFormat() | NC1HWC0 | Format actually used in execution |
| Storage | GetStorageShape() | [8, 1, 224, 224, 16] | In-memory form in the execution phase (includes dimension padding) |
| Composite | GetFormat() | {Origin: NCHW, Storage: NC1HWC0} | Carries semantic and execution information together |

Creation statement: parts of this article were generated with AI assistance (AIGC) and are for reference only.
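The Appendix A example can also be worked through in code. The sketch below assumes a hypothetical `ShapeDescriptor` struct and helper functions `ToNc1hwc0` and `MakeDescriptor`; none of these are GE's real classes or signatures. It illustrates two points from the article: StorageShape follows from OriginShape, OriginFormat, and StorageFormat, and two different origins can share the same storage shape.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using Dims = std::vector<int64_t>;

// Hypothetical descriptor (not GE's class): binds both perspectives together.
struct ShapeDescriptor {
  Dims origin;   // shape as the user understands it
  Dims storage;  // shape as laid out in memory
  const Dims& GetOriginShape() const { return origin; }
  const Dims& GetStorageShape() const { return storage; }
};

// Derive NC1HWC0 storage dims from a 4-D origin shape given in NCHW or NHWC.
Dims ToNc1hwc0(const Dims& origin, const std::string& origin_format, int64_t c0) {
  assert(origin.size() == 4 && c0 > 0);
  const int64_t n = origin[0];
  int64_t c = 0, h = 0, w = 0;
  if (origin_format == "NCHW") {
    c = origin[1]; h = origin[2]; w = origin[3];
  } else if (origin_format == "NHWC") {
    h = origin[1]; w = origin[2]; c = origin[3];
  } else {
    throw std::invalid_argument("unsupported origin format: " + origin_format);
  }
  const int64_t c1 = (c + c0 - 1) / c0;  // channel blocks after padding C up to a multiple of c0
  return {n, c1, h, w, c0};
}

// Build a descriptor that keeps the origin shape next to the derived storage shape.
ShapeDescriptor MakeDescriptor(const Dims& origin, const std::string& origin_format,
                               int64_t c0) {
  return ShapeDescriptor{origin, ToNc1hwc0(origin, origin_format, c0)};
}
```

With C0 = 16, an NCHW origin of [8, 3, 224, 224] and an NHWC origin of [8, 224, 224, 16] both yield the storage shape [8, 1, 224, 224, 16]. The storage side alone cannot tell them apart, which is the reason the article gives for binding both parts in one descriptor.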