
Monday, November 14, 2011

Apple and non-public APIs

We recently got a rejection from Apple for using a non-public API. Google was not a big help on this one, so here's the story of how we tracked it down.

We found that your app uses one or more non-public APIs, which is not in compliance with the App Store Review Guidelines. The use of non-public APIs is not permissible because it can lead to a poor user experience should these APIs change.

We found the following non-public API/s in your app:

appName
setAppName:

Step 1: Find in files

The search bar in Xcode has a dropdown that lets you search in frameworks as well as your own code. Unfortunately we got no hits with this search, which told us the private calls were not in our code.

Step 2: Command line tools

From the rejection mail:

Additionally, one or more of the above-mentioned APIs may reside in a static library included with your application. If you do not have access to the library's source, you may be able to search the compiled binary using "strings" or "otool" command line tools. The "strings" tool can output a list of the methods that the library calls and "otool -ov" will output the Objective-C class structures and their defined methods. These techniques can help you narrow down where the problematic code resides.

otool and strings are both on the global path, so you can run them from the command line in any directory. The trick is figuring out what to run them on. Xcode 4 tends to put build products in strange directories, so here's a shortcut to finding the actual file you want.


Find your app under "Products" in Xcode and right click on it to select "Show in Finder". This will open a Finder window where your .app file lives. Running otool or strings on the .app bundle itself will not work. Right click on your .app file and select "Show Package Contents", then find your executable file and copy it somewhere accessible.

Once in Terminal and in the right directory, the two commands to use are:
otool -ov ScribbleWormHD
strings ScribbleWormHD
These will output a ton of stuff. You likely won't be able to navigate it without using Command-F to search for the strings that Apple complained about. In our case otool did not give us anything useful. It would have been easy to track down the problem if it had, because otool will tell you what called the offending function. strings just outputs a list of every string used in your project. We managed to get a hit with that, so at least we had a way to test whether any changes we made would fix the problem.
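
If you don't want to scroll through the raw output, piping it through grep gets you straight to the flagged symbols:

strings ScribbleWormHD | grep -i appname
otool -ov ScribbleWormHD | grep -i appname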

* NOTE: strings did not work for us at all in Snow Leopard. It only worked in Lion.

Step 3: Head Scratching

So the calls to setAppName were not in our code, and we had no tool telling us directly where they were. Sometimes debugging degenerates into informed trial and error. We only have two middleware libraries in Scribble Worm: Flurry and AdMob. We started with Flurry on a hunch and removed it from the project as a test before running strings again. This time we were clean.

So we went to the Flurry site and found that they are on version 3.0 while we shipped with 2.8. One upgrade later and another run of strings and we were ready to submit.

Saturday, October 30, 2010

Porting from iPhone to OSX

If you've followed us at all, you know by now that I like porting to new platforms, at least new platforms that support C++. Every port seems to make the base engine a bit better, and it brings in a few extra bucks to support continued development. With the news of the Mac App Store coming soon, I had to jump on porting the Golden Hammer engine and Big Mountain Snowboarding to OSX.


Our engine started on Windows, moved to OSX (Carbon), then to iPhone/iPad, then to Android. The OSX port was never really finished because the iPhone took off. Carbon is outdated technology (see below), so for the port back to OSX I started over using Cocoa.


The port isn't fully ready for release, but within a week I was able to get the game pretty much working. This was definitely the easiest port so far.


Carbon vs Cocoa (use Cocoa)


OSX supports two different platform layers, Carbon and Cocoa. Carbon is in C, only runs in 32-bit, and does not seem to be fully supported anymore. The Carbon implementation in my engine gets a ton of deprecated-code warnings, and there's a Mac App Store requirement that apps not use deprecated technology. I have not seen an official and specific announcement from Apple on Carbon, but to be safe it's better to go with Cocoa.


Also, Cocoa is almost a direct equivalent of Cocoa Touch, the API used for iPhone development. There are a few little differences that I'll note in the next section, but for the most part you can make copies of your iPhone platform layer with different includes, fix the compiler errors, and be ready to go.


Cocoa vs Cocoa Touch


As I said, Cocoa is almost exactly like Cocoa Touch. For most of my classes I was able to just make a copy of the iPhone version, add some different frameworks, rename a couple classes and be good.


The frameworks I'm currently using are:

  • Cocoa.framework
  • OpenGL.framework
  • ApplicationServices.framework
  • OpenAL.framework
  • AudioToolbox.framework
  • AppKit.framework
  • CoreData.framework
  • Foundation.framework

Most UIKit classes have an AppKit equivalent. Instead of UIView, there's NSView. Instead of CGPoint, there's NSPoint. The first line of defense on a compile error is to stick an NS in front of the class name and see if that works.
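
A few hypothetical one-liners to show how mechanical the translation usually is:

// Cocoa Touch (iPhone)              // Cocoa (Mac)
UIView* view;                        NSView* view;
CGPoint p = CGPointMake(0, 0);       NSPoint p = NSMakePoint(0, 0);
UIColor* red = [UIColor redColor];   NSColor* red = [NSColor redColor];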


32 bit vs 64 bit


This probably won't matter to most developers, but Cocoa will compile for 64-bit systems. We're doing some behind-the-scenes magic with pointers and such, so I had to go through the codebase and replace a bunch of long data types with int32_t and u_int32_t, and remove some of the more questionable pointer code.


OpenGL vs OpenGL ES


For a first pass implementation, you can think of OpenGL as a near-direct equivalent to OpenGL ES1. This is a horrible simplification, and I fear a lot of OpenGL users coming at me with pitchforks for saying it. A better way to think of it is that ES1 is almost a direct subset of the full OpenGL, and you can get ES1 code running on OpenGL pretty easily. I have not yet ported our ES2 shader code, so I can't comment much on that aspect.


The includes you want are:

  • #include "OpenGL/gl.h"
  • #include "OpenGL/glu.h"

Make a copy of your ES1 implementation, change the includes, and hit compile. You will get a ton of compile errors. Most of them are easy to fix. For any function or variable that has "OES" at the end of it, simply delete the OES part. For any function named something like "glFrustumf", delete the "f". This will take care of 99% of the compile errors.
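
A few before-and-after examples of the renaming (the variable names are placeholders):

glFrustumf(l, r, b, t, zNear, zFar);                 ->  glFrustum(l, r, b, t, zNear, zFar);
glOrthof(l, r, b, t, zNear, zFar);                   ->  glOrtho(l, r, b, t, zNear, zFar);
glMapBufferOES(GL_ARRAY_BUFFER, GL_WRITE_ONLY_OES);  ->  glMapBuffer(GL_ARRAY_BUFFER, GL_WRITE_ONLY);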


I'm not quite 100% sure, but I don't think PVR4 support is available. If I'm wrong on this let me know! It would save me some work. Right now I just have uncompressed textures, but DXT support seems to be available for use.


OSX Input


Modern Macs support multitouch through the trackpad. The gotcha is that the mouse needs to be over the window in order for the app to receive any touch events. I'm planning on supporting keyboard input for those without a trackpad, and making the game fullscreen so we always get the touch events.


You'll want the following functions in your NSView:

- (void)mouseMoved:(NSEvent *)theEvent
- (void)mouseDragged:(NSEvent *)theEvent
- (void)mouseEntered:(NSEvent *)theEvent
- (void)mouseExited:(NSEvent *)theEvent
- (void)mouseDown:(NSEvent *)theEvent
- (void)mouseUp:(NSEvent *)theEvent
- (void)keyDown:(NSEvent *)theEvent
- (void)keyUp:(NSEvent *)theEvent
- (void)touchesBeganWithEvent:(NSEvent *)event
- (void)touchesMovedWithEvent:(NSEvent *)event
- (void)touchesEndedWithEvent:(NSEvent *)event
- (BOOL)acceptsFirstResponder { return YES; }


You'll also want to call these somewhere:

[window setAcceptsMouseMovedEvents:YES];
[view setAcceptsTouchEvents:YES];
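
As a sketch of what one of the touch handlers might look like (the call into the engine's input state is hypothetical):

- (void)touchesBeganWithEvent:(NSEvent *)event
{
    NSSet* touches = [event touchesMatchingPhase:NSTouchPhaseBegan inView:self];
    for (NSTouch* touch in touches)
    {
        // normalizedPosition ranges 0-1 across the trackpad surface.
        NSPoint pos = [touch normalizedPosition];
        // forward into the engine's input state here, e.g.
        // mInputState->handleTouchBegan(pos.x, pos.y);
    }
}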


Additional platform considerations


The straight port of snowboarding runs at 2ms/frame on my 13" MacBook, or 500fps. This is without making any use of multithreading or doing any real hardcore optimization anywhere. I think any halfway decent straight port of an iPhone app should run at crazy speeds on a low-end Mac.


So is a straight port good enough? I have no idea, and neither will anyone else until the Mac App Store has been out a while. Is it competing with Steam and games like Half-Life 2, or will the audience want smaller, simpler games? Can't tell yet! I'm exporting higher-res maps and doing something in the middle.


Tuesday, October 12, 2010

Converting from OpenGL ES1 to ES2 on the iPhone

I recently got through upgrading our engine to support ES2 and GLSL shaders. It took about a week to get the game looking just the same as it did before, but rendering with shaders instead. I'm sharing some info that might be worthwhile to anyone else trying to update their iPhone renderers. This is not a how-to on using GLSL to achieve different effects; you can find plenty of that elsewhere.


ES2 is not an incremental improvement to ES1, it is a total paradigm shift in how pixels get rendered. You can't just take an ES1 renderer and add a couple shaders here and there like you can in DirectX. In ES2, you write vertex and pixel/fragment shaders in GLSL, and then pass values to the shaders at runtime.


The vertex shader reads in any values from VBOs or vertex arrays, and outputs any values that are useful to the pixel shader. Any values created in the vertex shader are interpolated along the triangle edges and raster lines before being passed to the pixel shader. The pixel/fragment shader has only one job: to output a color value.


I've found this site to be a nice reference for GLSL functions, starting at section 8.1: http://www.khronos.org/files/opengl-quick-reference-card.pdf

3GS/iPhone4/iPad or bust:

ES2 is not supported on the 3G, the first-gen iPhone, or the first-gen iPod touch. It's possible to support both ES1 and ES2 in the same codebase, but you will need two entirely different render paths. There's no mixing and matching allowed: within a run you are either entirely ES1 or entirely ES2.

EAGLContext* eaglContext = 0;
eaglContext = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
if (eaglContext && [EAGLContext setCurrentContext:eaglContext])
{
    // initialize a renderer that uses ES2 imports
}
else
{
    eaglContext = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES1];
    if (!eaglContext || ![EAGLContext setCurrentContext:eaglContext])
    {
        // total failure!
    }
    // initialize a renderer that uses ES1 imports
}

Loading and assigning shaders:

The shader compiler deals in character buffers. You will need to either create a GLSL stream in code, or more sanely load up a file containing a shader and pass the contents to the compiler.

int loadShader(GLenum type, const char* glslSourceBuf)
{
    int ret = glCreateShader(type);
    if (ret == 0) return ret;

    glShaderSource(ret, 1, (const GLchar**)&glslSourceBuf, NULL);
    glCompileShader(ret);
    int success;
    glGetShaderiv(ret, GL_COMPILE_STATUS, &success);
    if (success == 0)
    {
        char errorMsg[2048];
        glGetShaderInfoLog(ret, sizeof(errorMsg), NULL, errorMsg);
        outputDebugString("Shader compile error: %s\n", errorMsg);
        glDeleteShader(ret);
        ret = 0;
    }
    return ret;
}

int loadShaderProgram(const char* vertSource, const char* pixelSource)
{
    // load in the two individual shaders
    int vertShader = loadShader(GL_VERTEX_SHADER, vertSource);
    int pixelShader = loadShader(GL_FRAGMENT_SHADER, pixelSource);

    // create a "program", which is a vertex/pixel shader pair.
    int ret = glCreateProgram();
    if (ret == 0) return ret;
    glAttachShader(ret, vertShader);
    glAttachShader(ret, pixelShader);

    // assign vertex attributes to the positions used later
    // in glVertexAttribPointer calls
    glBindAttribLocation(ret, AP_POS, "position");
    glBindAttribLocation(ret, AP_NORMAL, "normal");
    glBindAttribLocation(ret, AP_DIFFUSE, "diffuse");
    glBindAttribLocation(ret, AP_SPECULAR, "specular");
    glBindAttribLocation(ret, AP_UV1, "uv1");

    glLinkProgram(ret);
    int linked;
    glGetProgramiv(ret, GL_LINK_STATUS, &linked);
    if (linked == 0)
    {
        glDeleteProgram(ret);
        outputDebugString("Failed to link shader program.");
        return 0;
    }
    return ret;
}

void drawSomething(void)
{
    // tell opengl which shaders to use for rendering
    glUseProgram(mShaderProgram);

    // set any values on the shader that you want to use.
    // set up the vertex buffer using glVertexAttribPointer
    // calls and the same positions used during the linking.
    // then draw like usual. the count is the number of
    // indices (3 per triangle), not the number of triangles.
    glDrawElements(GL_TRIANGLES, numIndices, GL_UNSIGNED_SHORT, 0);
}

No matrix stack:

All transformations are done in the shader, so anything using glMatrixMode is automatically out. glFrustumf and glOrthof are also gone, so you will need to write replacements. You can find examples of these two functions in the Android codebase at http://www.google.com/codesearch/p?hl=en#uX1GffpyOZk/opengl/libagl/matrix.cpp&q=glfrustumf%20lang:c++&sa=N&cd=1&ct=rc&l=7.
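
Here's a minimal sketch of a glFrustumf replacement (not the engine's actual code): it fills a column-major 4x4 matrix with the standard OpenGL perspective projection, ready to hand to glUniformMatrix4fv.

void buildFrustum(float l, float r, float b, float t, float n, float f, float* m)
{
    // column-major layout: m[column*4 + row]
    m[0] = 2.0f*n/(r-l); m[4] = 0.0f;         m[8]  = (r+l)/(r-l);  m[12] = 0.0f;
    m[1] = 0.0f;         m[5] = 2.0f*n/(t-b); m[9]  = (t+b)/(t-b);  m[13] = 0.0f;
    m[2] = 0.0f;         m[6] = 0.0f;         m[10] = -(f+n)/(f-n); m[14] = -2.0f*f*n/(f-n);
    m[3] = 0.0f;         m[7] = 0.0f;         m[11] = -1.0f;        m[15] = 0.0f;
}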


For the transforms used by shaders, I have callbacks to grab values like ModelToView and ViewToProj from a structure that I calculate once per render pass.


In C++:

unsigned int transformShaderHandle = glGetUniformLocation(shaderId, "ModelToScreen");
glUniformMatrix4fv(transformShaderHandle, 1, GL_FALSE, (GLfloat*)mModelViewProj);

In the vertex shader:

uniform mat4 ModelToScreen;
attribute vec4 position;

void main()
{
    gl_Position = ModelToScreen * position;
}

More textures!

You only get two texture channels to use under ES1. ES2 gives you 8. Setting up a texture in ES2 is similar to ES1, but you don't get the various glTexEnvi functions to define how multiple texture channels blend together. You do that part in GLSL instead.


In C++:

unsigned int textureShaderHandle = glGetUniformLocation(shaderId, "Texture0");

// tell the shader that Texture0 will be on texture channel 0
glUniform1i(textureShaderHandle, 0);
// then bind the texture like you would in ES1. note there's no
// glEnable(GL_TEXTURE_2D) in ES2; that enable is gone along with
// the rest of the fixed-function pipeline.
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, mTextureId);

In the pixel shader:

uniform lowp sampler2D Texture0;
// uv coordinates handed down from the vertex shader
varying mediump vec2 v_uv1;

void main()
{
    gl_FragColor = texture2D(Texture0, v_uv1);
}
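
The glTexEnvi blending you'd have done in ES1 becomes a plain expression in the pixel shader. A sketch of a two-texture GL_MODULATE equivalent (Texture1 and the shared uv are illustrative):

uniform lowp sampler2D Texture0;
uniform lowp sampler2D Texture1;
varying mediump vec2 v_uv1;

void main()
{
    // GL_MODULATE equivalent: multiply the two samples together.
    gl_FragColor = texture2D(Texture0, v_uv1) * texture2D(Texture1, v_uv1);
}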

No such thing as glEnableClientState(GL_NORMAL_ARRAY)

GL_NORMAL_ARRAY, GL_COLOR_ARRAY, etc. have all gone away. Instead you use the unified glVertexAttribPointer interface to push vertex buffer info to the shaders. This is a pretty simple change.

glEnableVertexAttribArray(AP_NORMAL);
glVertexAttribPointer(AP_NORMAL, 3, GL_FLOAT, GL_FALSE, vertDef.getVertSize(),
                      (GLvoid*)(vertDef.getNormalOffset()*4));

Tuesday, September 28, 2010

100k animated triangles at 30fps on iPhone

I've been optimizing OverPowered all week, and have managed to more than double the number of bad guys we can support, and removed a 10 meg memory spike on load that was sending us low-memory warnings. Here is some info that I've picked up in the process.


The strategy: Find the bottlenecks, kill the bottlenecks.


The iPhone actually has two processors, the CPU and the GPU. If one of those is eating up all the time, then it's not worth optimizing the other. OverPowered is an action game, so it's important to keep the framerate at least 30fps. At the start of the week we were overwhelmingly GPU-bottlenecked at around 20k animated triangles per frame. I managed to narrow down enough fixes that it's now more worthwhile to optimize the CPU than to try to keep shrinking the GPU cost.


I used 4 tools this week.

  • An onscreen FPS counter with number of triangles drawn. This is the only really definitive way to know how fast your game is running, but it's extremely low resolution. The iPhone is vsync locked, so if you are at 30fps it will take a huge change to make it display anything else.
  • An in-game timer. I wrap timer calls around various functions to measure their real cost. This is a very useful way to get a high-level view of where your frame time is going, with a high degree of confidence: it can be run in release without much profiling overhead to skew the results. I have it spitting out the frame breakdown every time I exit a level. (A minimal sketch of this kind of timer appears after this list.)
  • Instruments: CPU sampler. This is a fairly lightweight sampling profiler. As long as you sanity check the results with the in-game timer it can be used to get a higher resolution view of bottlenecks.
  • Instruments: Allocations. This tool is absolutely awesome for telling you where your memory is being spent. All platforms should have a tool like this.
  • I did not use Shark. This can give you a better view than the CPU sampler, but it's much heavier weight. It takes longer to get results and try out changes. It's good if you have a specific set of functions that you really want to optimize at a low level.
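
Here's a minimal sketch of the in-game timer idea. Everything in it is illustrative, including the platform-supplied getTimeMs():

#include <cstdio>
#include <map>
#include <string>

// monotonic clock provided by the platform layer; an assumption here.
extern double getTimeMs();

// per-frame accumulation of time spent in each named block.
static std::map<std::string, double> sFrameTimes;

struct ScopedTimer
{
    const char* mName;
    double mStart;
    ScopedTimer(const char* name) : mName(name), mStart(getTimeMs()) {}
    ~ScopedTimer() { sFrameTimes[mName] += getTimeMs() - mStart; }
};

// spit out the breakdown, e.g. on level exit.
void dumpFrameTimes()
{
    std::map<std::string, double>::iterator i;
    for (i = sFrameTimes.begin(); i != sFrameTimes.end(); ++i)
    {
        printf("%s: %.2fms\n", i->first.c_str(), i->second);
    }
    sFrameTimes.clear();
}

// usage: { ScopedTimer t("physics"); runPhysics(); }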

Pixel fill rate:


The number of pixels drawn seems to be the biggest deal on this platform. I read somewhere that you can draw the full screen about 5 times at 30fps on the 3GS if nothing else is going on, and my own tests are about the same. If you draw a background image, then the ground, then a gui and a bunch of little objects, you can easily be drawing the screen three times over if you set it up wrong.


The iPhone supports fast hidden surface removal with its tile-based deferred renderer. Opaque objects drawn on top of each other largely avoid the overdraw issue, because objects that will be fully covered by other objects within a tile are culled early. So…make your gui out of opaque rectangular textures? Not really an option.


All the advice I can give on this one is to be aware of the fill rate limitations and design appropriately. If your game design requires 10 fullscreen blended textures to be drawn on top of each other every frame, it's not going to work no matter how much optimization you do. Try to avoid drawing large blended textures if possible; a large alpha-tested object is one of the worst things you can do for rendering performance.


I was trying to do an effect that draws the entire world to an offscreen buffer, then overlays it on the screen for pixel shader effects. I ended up having to abandon this approach after getting it working, because the pixel fill rate got in the way.


Vertex upload speed:


When you use vertex arrays, the entire vertex buffer is uploaded to the GPU every frame. This stalls the whole pipeline while the GPU waits for the memory transfer. VBOs eliminate this lag for static data, things you rarely change. All else being equal, the difference between vertex arrays (20k verts) and VBOs (100k verts!) is huge. The max number of verts you can push is probably much higher still; I have a full game running and lots of pixel overdraw.
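
A minimal VBO setup sketch; the data variables and the AP_POS attribute slot are assumptions carried over from the shader post above.

// create the buffer once at load time. GL_STATIC_DRAW hints that
// the data is uploaded once and drawn many times.
GLuint vbo = 0;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, numVerts * vertSizeBytes, vertData, GL_STATIC_DRAW);

// at draw time, bind the buffer and point attributes at byte offsets
// within it instead of at client memory.
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glVertexAttribPointer(AP_POS, 3, GL_FLOAT, GL_FALSE, vertSizeBytes, (GLvoid*)0);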


Our scene is now entirely static buffers. We use a vertex shader to do all of the animation on the GPU by passing static buffers that represent the frames to interpolate between, and a float argument to represent position in between the frames. This change alone let me double the amount of onscreen badguys.


It's also the reason why we won't be able to support the original iPhone and 3G for OverPowered. For a small company, the difference in power is too great to max out the newer phones while still running on the older ones. When I pause the game I can fill the screen completely with Quake 3 models without dropping below 30fps, so the bottleneck has moved away from the rendering code and into the game/physics/render-setup code.


The vertex processor seems to be very powerful compared to the rest of the pipeline. I have not seen any slowdown from making the vertex shader more complicated so far, so I plan on abusing this as much as possible. Here is the relevant part of my shader code.

uniform mat4 ModelViewProj;
// the position of the low frame
attribute vec4 position;
// the position of the high frame, pulled from a different buffer
attribute vec4 diffuse;
// the pct of progress the animation has made between the two frames
uniform mediump float PctLow;

void main()
{
    vec4 interpolatedPos = mix(position, diffuse, PctLow);
    gl_Position = ModelViewProj * interpolatedPos;
}


Texture size is important:


I reduced a gui texture from a 256x256 PNG to 64x64 and saw good results. This is a texture that's drawn a bunch of times every frame. All of our opaque textures are PVR4-compressed, and that was also a big win over uncompressed textures.


Memory spikes in Objective-C:


The 10 meg spike that I mentioned was in our platform layer, due to bad handling of autoreleased memory. We generally drain the autorelease pool at the end of every frame. Usually we can get away with this because most of our allocations are in C++, and they go away as soon as we tell them to. Texture loading is an exception because it happens in platform code.


During the course of loading a level, we'd read in a texture buffer, create the OpenGL texture, and then release the texture buffer. This works fine if only one texture is loaded that frame, but not if we are loading a whole level at once. If you identify a place like this in Objective-C code, an easy fix is to put an NSAutoreleasePool around it.

NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
// do stuff
[pool release];

OpenGL State Changes:


These don't seem to be causing me any troubles right now. Our renderer has always been pretty good at batching materials, so it's not an issue I had to touch on this week.



Saturday, August 28, 2010

Supporting iPhone and Android in the same codebase

I've noticed that a lot of people are surprised we have a cross platform engine for iPhone and Android. It also runs on Windows and Mac. Within a short period we could have any of our iPhone games running on any of those other platforms. I thought it would be worthwhile to go into some details about how we did it.

Engine and Platform Philosophy:

The Golden Hammer engine was written from the get-go to be cross platform. Keeping other platforms in mind early on will save a lot of effort when it's time to port. This really boils down to three things:
  • C++ is used for the bulk of the code.
  • Interface classes are used to wrap major systems.
  • No direct calls to the operating system are allowed outside of a platform wrapper.

C++ is supported on PC, Mac, iPhone, Android, Xbox, Wii, PS3, and the list goes on. The operating system calls on iPhone are in Objective-C, but the rest of your code can be in C++. Likewise with Java on Android, through use of the NDK.

Major systems like rendering are wrapped in a high-level interface. We support DirectX, OpenGL, and OpenGL ES renderers. Keeping the interface high level keeps the nasty high-frequency virtual calls at bay when setting render states. For iPhone and Android you can get away with just an OpenGL ES renderer, but it's nice not to be locked out of platforms like the Wii later on. Sound is one place where you will need an interface in order to successfully port from iPhone to Android.

Platform function calls are all abstracted behind interfaces. On the iPhone this means that Objective-C calls are not allowed anywhere in the game code outside a minimal set of classes that represent the platform. On Android this means that the game is not allowed to make any JNI calls except from within the platform layer.

Our engine is about 60,000 lines of platform-independent C++ code, with the platform interface layer running about 2,000 lines. In theory we don't need to modify any of those 60k lines in order to add support for a new platform.

Linking C++ and Operating System calls:

Calling Objective-C code from C++ is super easy through the use of .mm files, which let you compile both Objective-C and C++ in the same file. Provide a C++ interface for the game to use, and then subclass it in a .mm file to make the platform calls.
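
A minimal sketch of the pattern (the class names and the UIAlertView call are illustrative, not from our engine):

// C++ interface in a plain header, visible to game code:
class MessageBox
{
public:
    virtual ~MessageBox() {}
    virtual void show(const char* text) = 0;
};

// iPhone subclass in a .mm file, which is free to mix in Objective-C:
#import <UIKit/UIKit.h>

class IPhoneMessageBox : public MessageBox
{
public:
    virtual void show(const char* text)
    {
        NSString* str = [NSString stringWithUTF8String:text];
        UIAlertView* alert = [[UIAlertView alloc] initWithTitle:@"" message:str
            delegate:nil cancelButtonTitle:@"OK" otherButtonTitles:nil];
        [alert show];
        [alert release];
    }
};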

Communicating between C++ and Java is a bit more work, with JNI as your only choice. You create a Java function on one side and a C function on the other to act as glue code, then subclass your C++ interface to call your new wrapper functions. I recommend doing some web searches on JNI for details.
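
For the Java-to-C++ direction, the glue is a C function whose name encodes the Java package, class, and method. A minimal sketch with made-up names:

#include <jni.h>

// matches a Java declaration like:
//   package com.example.game;
//   public class GameActivity { public native void nativeTouch(float x, float y); }
extern "C" JNIEXPORT void JNICALL
Java_com_example_game_GameActivity_nativeTouch(JNIEnv* env, jobject thiz,
                                               jfloat x, jfloat y)
{
    // forward into the C++ input state here.
}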

Handling Input:

There are many ways to set up your interface to the platform input. I don't strongly advocate one over another, but I can explain how we do it. The platform code never directly causes the game to do anything. We maintain an input state class with all the current key positions, accelerometer values, etc. When an accelerometer call comes in, we update the input state. On the next frame the game looks at the updated input and makes decisions. On iPhone this has the added bonus of never doing much work in response to an input event, which prevents the operating system buffer from overflowing and losing events.
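
A bare-bones sketch of the idea, with made-up field names:

// written by the platform layer as events arrive; read by the game each frame.
struct InputState
{
    bool keysDown[256];
    float accelX, accelY, accelZ;
    float touchX, touchY;
    bool touchActive;
};

// platform event handler writes:  gInput.accelX = event.x;
// game update reads:  if (gInput.touchActive) steer(gInput.touchX, gInput.touchY);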

The iPhone/Android Platform Layer:
  • Texture loading: On iPhone we use custom code for PVR loading, and UIImage for uncompressed textures. On Android we use Bitmap.
  • Sound: On iPhone we use a platform-independent OpenAL library, with platform code for loading the effect files and AVAudioPlayer for music. On Android we tell Java code to start playing a sound.
  • File Access: This is a simple .mm interface on iPhone, but on Android we need to ask Java for a file handle that we can then read in C++.
  • Handling OS input events.
  • Creation of the OpenGL context.
  • Time calculation.
  • Threads.
  • Debug output.
Gotchas:

Android does not have breakpoints for native code. This means you are stuck with printf debugging while doing the port.

Setting up the Android NDK environment is time consuming. Essentially you need to download and set up the NDK, create makefiles, and build your C++ from the command line. You then link the native library into your Eclipse project.

Hardware fragmentation on Android is a real problem, even ignoring the performance differences between a Galaxy and a Hero. All iPhones support OpenGL ES 1.1, but some Android phones are still on ES 1.0. The Hero can't generate mipmaps on the GPU, for example. Be prepared to test on a wide variety of phones.

File size matters on Android phones before 2.2. The phones have a limited amount of space available for apps, because they can't be installed on the much larger SD card. If your app is too large, owners of older phones will uninstall quickly.

There is no standard compressed texture format on Android. If you choose to include PVR files, you will also need to include an uncompressed fallback texture, which can be a problem when considering the file size.

Sounds on Android are slow, especially for starting and stopping. Maybe I'm missing something in my implementation, but I had to remove several sound effects during the port for performance reasons.

The NDK does not come with STL included. You will need to use something like STLPort.

The iPhone and Android audiences are not the same! Be prepared to have a new marketing challenge once your port is complete.

Final Notes:

Several of the topics here could easily be expanded into their own posts. If you'd like to see more detail on anything please let me know.